Object Tracking + Siamese Networks: Paper and Corresponding Code Implementation

165 files in total: jpg (81), py (49), pyc (27)

Tags: networks, object tracking, graduation project

Uploaded 2024-05-11 16:21:46, 2.99 MB ZIP

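Object tracking is a core task in computer vision: locating and following a specific object across consecutive video frames. Siamese networks are an effective framework for this task. By learning a similarity measure, they help the system find the region in a new frame that resembles the target in the initial frame. This resource is a project that bundles a Siamese-network tracking paper with related tracking code, which makes it a reasonable starting point for beginners.

First, the basic principle of a Siamese network. The architecture consists of two branches that share weights and is typically used to compare the similarity of two inputs. In tracking, one branch processes the initial target image (the template) and the other processes candidate target regions in the current frame. The similarity between the two branch outputs determines whether a candidate region contains the target. This design reduces the complexity of online training and makes tracking more efficient.

The project may include the following:

1. Paper: one or more research papers on applying Siamese networks to object tracking, likely covering the network architecture, loss functions, training strategy, and experimental results. Reading them helps in understanding current research and the technical details.
2. Code: the implementation likely covers data preprocessing, model definition, the training pipeline, and inference on test videos. This usually involves Python and a deep learning framework such as TensorFlow or PyTorch. Beginners can study the code to learn the concrete steps of tracking and how the theory is applied in practice.
3. TADT_rebuild: the name suggests a rebuilt or modified version of the TADT tracker. TADT most likely refers to Target-Aware Deep Tracking (CVPR 2019), a Siamese-matching tracker that selects target-aware feature channels from a pre-trained CNN to improve robustness and accuracy.

Working through the project can teach you to:

- Understand the Siamese network architecture and its use in object tracking.
- Build and train deep learning models.
- Process video data, including frame reading, preprocessing, and feature extraction.
- Apply a model to a real tracking task, including candidate region generation and matching strategies.
- Analyze and optimize the performance of a tracking algorithm.

By practicing with this project, beginners can get a closer look at the challenges of object tracking and their solutions, and lay a foundation for a later graduation project or software development work. It also builds the ability to solve complex computer vision problems, which is valuable for work in artificial intelligence and computer vision.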
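The two-branch comparison described above can be sketched in a few lines. This is a minimal illustration, not code from the archive: the one-layer `embed` function, the hand-picked `WEIGHTS`, and the toy four-pixel patches are all assumptions chosen so the example runs with the standard library only. A real tracker would use a trained convolutional backbone.

```python
import math


def embed(patch, weights):
    """Shared embedding branch: one linear layer followed by ReLU.

    The template and every candidate pass through this same function with
    the same weights, which is what "shared weights" means here.
    """
    return [max(0.0, sum(w * p for w, p in zip(row, patch))) for row in weights]


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_candidate(template, candidates, weights):
    """Return the index of the candidate most similar to the template."""
    t = embed(template, weights)
    scores = [cosine_similarity(t, embed(c, weights)) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical weights and patches, for illustration only.
WEIGHTS = [
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5, 0.5],
]
template = [0.9, 0.1, 0.2, 0.8]
candidates = [
    [0.1, 0.9, 0.8, 0.2],  # reversed pattern
    [0.8, 0.2, 0.1, 0.9],  # close to the template
    [0.5, 0.5, 0.5, 0.5],  # flat background
]

index, scores = best_candidate(template, candidates, WEIGHTS)
print("best candidate:", index)
```

Because the template is embedded once and only compared afterwards, nothing has to be retrained while the video plays, which is the efficiency argument made in the description.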


Object Tracking + Siamese Networks: Paper and Corresponding Code Implementation (165 files)

.DS_Store 6KB

tadt_rebuild.iml 398B

0070.jpg 31KB

0063.jpg 31KB

0068.jpg 31KB

0069.jpg 31KB

0067.jpg 31KB

0064.jpg 31KB

0059.jpg 31KB

0060.jpg 31KB

0062.jpg 31KB

0065.jpg 31KB

0057.jpg 31KB

0066.jpg 31KB

0073.jpg 31KB

0058.jpg 31KB

0074.jpg 31KB

0076.jpg 31KB

0072.jpg 31KB

0061.jpg 31KB

0075.jpg 30KB

0078.jpg 30KB

0077.jpg 30KB

0055.jpg 30KB

0079.jpg 30KB

0054.jpg 30KB

0080.jpg 30KB

0071.jpg 30KB

0081.jpg 30KB

0053.jpg 30KB

0052.jpg 30KB

0056.jpg 30KB

0051.jpg 30KB

0050.jpg 30KB

0003.jpg 30KB

0048.jpg 30KB

0049.jpg 29KB

0002.jpg 29KB

0005.jpg 29KB

0006.jpg 29KB

0004.jpg 29KB

0046.jpg 29KB

0047.jpg 29KB

0009.jpg 29KB

0045.jpg 29KB

0010.jpg 29KB

0008.jpg 29KB

0007.jpg 29KB

0012.jpg 29KB

0013.jpg 29KB

0043.jpg 29KB

0044.jpg 29KB

0040.jpg 29KB

0039.jpg 29KB

0042.jpg 29KB

0014.jpg 29KB

0028.jpg 29KB

0011.jpg 29KB

0018.jpg 29KB

0029.jpg 29KB

0019.jpg 29KB

0030.jpg 29KB

0015.jpg 28KB

0041.jpg 28KB

0017.jpg 28KB

0031.jpg 28KB

0020.jpg 28KB

0016.jpg 28KB

0025.jpg 28KB

0021.jpg 28KB

0032.jpg 28KB

0033.jpg 28KB

0023.jpg 28KB

0035.jpg 28KB

0038.jpg 28KB

0034.jpg 28KB

0027.jpg 28KB

0024.jpg 28KB

0037.jpg 28KB

0026.jpg 28KB

0036.jpg 28KB

0022.jpg 28KB

0001.jpg 28KB

PENEt (18).pdf 483KB

taf.py 33KB

update_weight.py 20KB

tadt_changed_update_refind_incred.py 20KB

tadt_reg_loss_change.py 19KB

config.py 19KB

tadt_changed_update_refind.py 19KB

tadt2_mask.py 18KB

tadt_reg_weight_change.py 18KB

tadt_test_negative_relative.py 17KB

tadt_tracker_changed.py 16KB

tadt_chaged_rotation.py 15KB

backbone_v2.py 15KB

feature_utils_v2.py 13KB

PE.py 11KB

tests.py 10KB

tadt_tracker.py 10KB


PERCEPTION ENHANCED FRAME FOR VISUAL OBJECT TRACKING

Binpeng Song¹,², Jianfeng Liu¹,², Jian Ye

¹ Institute of Computing Technology, Chinese Academy of Sciences

² University of Chinese Academy of Sciences

ABSTRACT

Deep trackers built on networks pre-trained on object detection datasets have shown great potential in visual object tracking. However, the gap between object detection and object tracking is non-negligible, and the fixed template built from the initial target feature in some previous deep trackers greatly limits their performance. We therefore propose a perception enhanced frame (PEF) that exploits target-aware features, which better distinguish the target from the background, and updates the template features through the response map. Our PEF tracker uses a fully connected network with a mask loss to select target-aware feature channels and updates the template to improve robustness. This reduces the number of deep features, enhances discriminative ability, and ensures the diversity of the comparison template. Experimental results on three popular datasets show that our method achieves superior performance to state-of-the-art trackers in terms of accuracy and speed.

Index Terms: feature selection, convolution network, mask loss, template update, object tracking

1. INTRODUCTION

Visual object tracking is a core task in computer vision and is critical to many online, real-time applications such as intelligent transportation, video monitoring, and intelligent robots. Object tracking attempts to capture the trajectory of a target through a sequence of images, given the target's bounding box in the initial frame. Traditional tracking methods that use raw color features or hand-crafted features such as HOG and Color Names [1, 2, 3, 4, 5] guarantee real-time performance but can hardly meet localization accuracy requirements [5, 6, 7]. Recently, visual trackers based on convolutional features have attracted wide attention [8, 9, 10], and performance has improved significantly thanks to the power of convolutional feature extraction.

Numerous popular deep trackers draw inspiration from networks pre-trained for object detection. Detection modules can both improve localization precision and provide better discriminability against occlusions and background [11]. Although detection-based frameworks achieve state-of-the-art performance [2, 7], the gap between object detection and object tracking is non-negligible. First, object detection aims to distinguish specific classes, whereas object tracking is supposed to follow moving objects. Second, object detection does not need to differentiate intra-class instances, whereas object tracking does [11]. The pre-trained network therefore contains redundant convolution channels for object tracking, which may harm target localization and tracking efficiency. In addition, we observe that earlier tracking works use a fixed template. During tracking, the object's appearance may change because of its own actions or changes in the viewing perspective; for example, a person takes different poses while walking.

Fig. 1. The image (left) shows the target search area and the target object (red box). The heat map (middle) shows the confidence map of the search area from the original Siamese tracker using the VGG-16 network. The heat map (right) shows the confidence map obtained with our PEF.

To address the above issues, we propose the perception enhanced frame (PEF). In this work, PEF is built upon an advanced deep architecture, the Siamese matching network [8]. To deal with redundant channels from the pre-trained model, PEF exploits target-aware features that better distinguish the target from the background, as shown in Figure 1. Instead of using a fixed template, we incorporate a dynamic template mechanism that updates the template features using feedback from the response map. In experiments, we evaluate the proposed PEF tracker on three benchmark datasets, and the results demonstrate that PEF effectively improves Siamese trackers in terms of accuracy and tracking speed.

The major contributions can be summarized as follows:

1) We propose the PEF to filter feature channels to reduce the interference of redundant feature channels.

2) We develop a dynamic template updating strategy to take the place of the fixed template.

3) We evaluate the proposed method extensively on three popular benchmarks. We show that the proposed tracker achieves promising results under real-time conditions.

2. PROPOSED METHOD

In this section, we present the deep tracker based on PEF. We first introduce the Siamese framework and analyze the feasibility of the feature selection scheme. Then we describe how the perception enhanced module selects feature channels. Finally, we present the template updating strategy.

2.1. Tracker based on Siamese matching framework

SiameseFC [8] is the most representative tracker based on a deep convolutional network, and its tracking performance depends mainly on the discriminative ability of the offline-trained network. We adopt the Siamese framework as the basic tracker and formulate the process as follows. $\phi_x$ and $\phi_z$ model the feature extraction branches for the search region and the target template, respectively. We assume that the target moves smoothly and that its speed is perceptible between two consecutive frames. Thus, we crop the search region $X$ centered at the position of the target template $Z$ in the last frame; this region contains the target in the current frame if the tracker located the target accurately. The response map that determines the target location is calculated as follows:

$$R = F_z * F_x \tag{1}$$

where $F_z = \phi_z(Z)$ and $F_x = \phi_x(X)$ are the features extracted from the template $Z$ and the search region $X$, respectively, and $*$ denotes cross-correlation. $R$ denotes the response map, from which the target position is obtained as

$$P = \arg\max(\hat{R}) \tag{2}$$

where $\hat{R}$ is the linear interpolation of the response map $R$. Each value in the response map indicates the confidence that the corresponding position is the real target, so the position $P$ of the maximum value is the estimated position of the target.
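As a concrete reading of Eq. 1 and Eq. 2, the sketch below slides a single-channel template feature over a single-channel search feature and takes the position of the strongest response. It is an illustration under simplifying assumptions, not the paper's implementation: the function names and toy feature values are hypothetical, only one channel is used, and the linear interpolation step of Eq. 2 is skipped.

```python
def cross_correlate(search, template):
    """Valid-mode 2-D cross-correlation of one feature channel (Eq. 1)."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Inner product between the template and the window at (i, j).
            row.append(sum(template[u][v] * search[i + u][j + v]
                           for u in range(th) for v in range(tw)))
        response.append(row)
    return response


def argmax2d(response):
    """(row, col) of the largest response (Eq. 2 without interpolation)."""
    positions = ((i, j)
                 for i in range(len(response))
                 for j in range(len(response[0])))
    return max(positions, key=lambda p: response[p[0]][p[1]])


# Toy features: the template pattern sits at rows 1-2, columns 2-3.
template_feat = [[1.0, 2.0],
                 [3.0, 4.0]]
search_feat = [[0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 2.0],
               [0.0, 0.0, 3.0, 4.0],
               [0.0, 0.0, 0.0, 0.0]]

response = cross_correlate(search_feat, template_feat)
position = argmax2d(response)
print(position)  # (1, 2)
```

The returned position is in response-map coordinates; a tracker would still have to map it back to image coordinates using the stride of the backbone and the crop offset of the search region.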

Further analysis shows that $F_z$ and $F_x$ have the same number of feature channels. This fact offers a basis for refining how the response map is generated, and the response map $R$ can be written as

$$R = \sum_{k=0}^{n-1} \left( F_z^k * F_x^k \right) \tag{3}$$

where $n$ denotes the number of feature channels, and $F_z^k$ and $F_x^k$ are the feature representations of the $k$-th feature channel obtained from the target template and the search area, respectively.

Eq. 3 shows that the response map $R$ can be calculated by summing the per-channel response maps obtained by correlating the template with the search region one feature channel at a time. This provides theoretical support for further refining the feature selection process and for weighting different feature channels.
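The per-channel decomposition can be illustrated with a small sketch. It computes one response map per channel, then sums them with optional weights. The weights here are set by hand purely for illustration; how the paper's module actually learns which channels to keep (a fully connected network with a mask loss) is not reproduced, and all names and values below are hypothetical.

```python
def correlate(search, template):
    """Valid-mode 2-D cross-correlation of a single feature channel."""
    th, tw = len(template), len(template[0])
    return [[sum(template[u][v] * search[i + u][j + v]
                 for u in range(th) for v in range(tw))
             for j in range(len(search[0]) - tw + 1)]
            for i in range(len(search) - th + 1)]


def fused_response(search_channels, template_channels, weights=None):
    """Sum of per-channel response maps; all-ones weights give plain Eq. 3."""
    if weights is None:
        weights = [1.0] * len(template_channels)
    total = None
    for w, s, t in zip(weights, search_channels, template_channels):
        r = correlate(s, t)
        if total is None:
            total = [[0.0] * len(r[0]) for _ in r]
        for i, row in enumerate(r):
            for j, value in enumerate(row):
                total[i][j] += w * value
    return total


# Channel 0 responds to the target (columns 2-3 of the search region);
# channel 1 responds strongly to background at columns 0-1.
template_channels = [[[1.0, 1.0]], [[1.0, 1.0]]]
search_channels = [
    [[0.0, 0.0, 1.0, 1.0]],
    [[3.0, 3.0, 0.0, 0.0]],
]

plain = fused_response(search_channels, template_channels)
masked = fused_response(search_channels, template_channels, weights=[1.0, 0.0])
print(plain)   # [[6.0, 4.0, 2.0]]  peak on the background
print(masked)  # [[0.0, 1.0, 2.0]]  peak on the target
```

With equal weights the distractor channel dominates the sum and moves the peak to the background, while zeroing that channel restores the peak at the target. This is the effect that channel selection aims for.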

2.2. Perception Enhanced Module

Fig. 2. Full architecture of the proposed algorithm.

The issues brought by using a pre-trained network for tracking come mainly from three aspects. First, the pre-trained deep convolutional network is agnostic to the tracked target, because the target may never appear in the offline training data. Second, pre-trained networks focus on inter-class differences and ignore intra-class differences. Third, many feature channels contribute little because of the gap between object tracking and object detection, which leads to high computational complexity and even over-fitting.

In view of Eq. 3, we construct the perception enhanced module shown in Figure 2 to obtain robust features through a mask loss. The architecture of the tracker consists of a CNN feature backbone network, a perception enhanced module (PEM), and a correlation matching module. In the PEM, the tracker first decomposes the convolutional features into single feature channels and generates a separate response map for each, then uses a fully connected network (FCN) with the mask loss to filter the appropriate features, and finally computes the ultimate response map and determines the position of the target with the correlation matching module. The most important part of this process is feature selection. Given a pre-trained deep network with output feature space χ, the PEM can get more robust and dis-


Uploader: taotaobujuerulv
