2 L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, P. H. S. Torr
Several recent works have aimed to overcome this limitation using a pre-
trained deep conv-net that was learnt for a different but related task. These
approaches either apply “shallow” methods (e.g. correlation filters) using the
network’s internal representation as features [5,6] or perform SGD (stochastic
gradient descent) to fine-tune multiple layers of the network [7,8,9]. While the use
of shallow methods does not take full advantage of the benefits of end-to-end
learning, methods that apply SGD during tracking to achieve state-of-the-art
results have not been able to operate in real-time.
We advocate an alternative approach in which a deep conv-net is trained to
address a more general similarity learning problem in an initial offline phase,
and then this function is simply evaluated online during tracking. The key con-
tribution of this paper is to demonstrate that this approach achieves very com-
petitive performance in modern tracking benchmarks at speeds that far exceed
the frame-rate requirement. Specifically, we train a Siamese network to locate
an exemplar image within a larger search image. A further contribution is a
novel Siamese architecture that is fully-convolutional with respect to the search
image: dense and efficient sliding-window evaluation is achieved with a bilinear
layer that computes the cross-correlation of its two inputs.
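The dense evaluation described above can be pictured as sliding the exemplar's embedding over the search image's embedding and taking an inner product at every translation. The following is a minimal numpy sketch of that cross-correlation operation on two feature maps; the function name and the naive loop are illustrative (a real implementation would use an optimized convolution routine), but the computed score map is the same quantity.

```python
import numpy as np

def cross_correlation(z_feat, x_feat):
    """Dense score map from sliding the exemplar embedding over the
    search embedding (illustrative sketch, not the paper's code).

    z_feat: (C, h, w) exemplar feature map.
    x_feat: (C, H, W) search feature map, with H >= h and W >= w.
    Returns an (H-h+1, W-w+1) map of inner products, one per translation.
    """
    C, h, w = z_feat.shape
    _, H, W = x_feat.shape
    score = np.empty((H - h + 1, W - w + 1))
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            # Inner product between the exemplar embedding and the
            # co-located window of the search embedding.
            score[i, j] = np.sum(z_feat * x_feat[:, i:i + h, j:j + w])
    return score
```

The peak of the returned map gives the translation at which the exemplar's representation best matches the search image's representation, which is what makes a single forward pass sufficient for localisation.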
We posit that the similarity learning approach has been relatively neglected
because the tracking community did not have access to vast labelled datasets.
In fact, until recently the available datasets comprised only a few hundred anno-
tated videos. However, we believe that the emergence of the ILSVRC dataset for
object detection in video [10] (henceforth ImageNet Video) makes it possible to
train such a model. Furthermore, the fairness of training and testing deep models
for tracking using videos from the same domain is a point of controversy, as it
has been recently prohibited by the VOT committee. We show that our model
generalizes from the ImageNet Video domain to the ALOV/OTB/VOT [1,11,12]
domain, enabling the videos of tracking benchmarks to be reserved for testing
purposes.
2 Deep similarity learning for tracking
Learning to track arbitrary objects can be addressed using similarity learning.
We propose to learn a function f(z, x) that compares an exemplar image z to a
candidate image x of the same size and returns a high score if the two images
depict the same object and a low score otherwise. To find the position of the
object in a new image, we can then exhaustively test all possible locations and
choose the candidate with the maximum similarity to the past appearance of the
object. In experiments, we will simply use the initial appearance of the object
as the exemplar. The function f will be learnt from a dataset of videos with
labelled object trajectories.
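The exhaustive search over locations can be sketched as follows. Here f is a stand-in cosine similarity between flattened same-size crops, purely to illustrate the interface; in the paper f is a learnt deep embedding, and the function names are assumptions of this sketch.

```python
import numpy as np

def f(z, x):
    """Toy similarity between two same-size crops: cosine similarity of
    the flattened pixels. A placeholder for the learnt function f(z, x)."""
    z, x = z.ravel(), x.ravel()
    return float(z @ x / (np.linalg.norm(z) * np.linalg.norm(x) + 1e-8))

def locate(exemplar, frame):
    """Score every same-size window of `frame` against the exemplar and
    return the top-left corner of the highest-scoring candidate."""
    h, w = exemplar.shape
    H, W = frame.shape
    best, best_pos = -np.inf, (0, 0)
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            s = f(exemplar, frame[i:i + h, j:j + w])
            if s > best:
                best, best_pos = s, (i, j)
    return best_pos
```

The fully-convolutional architecture of the previous section computes exactly this kind of dense score map in one pass, rather than cropping and scoring each candidate window separately.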
Given their widespread success in computer vision [13,14,15,16], we will use a
deep conv-net as the function f. Similarity learning with deep conv-nets is typ-
ically addressed using Siamese architectures [17,18,19]. Siamese networks apply
an identical transformation ϕ to both inputs and then combine their represen-