[CVPR2017] Person Re-ID resources - CSDN Library

14 files in total (PDF: 14)

Tag: CVPR2017

Uploaded 2018-06-12 11:28:40, 20.19 MB RAR, 56 views, points required: 9

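[CVPR2017] Person Re-ID (person re-identification) is an important research topic in computer vision that focuses on recognizing the same individual across different camera views. The core problem is cross-camera re-identification: once a person has been captured by one surveillance camera, find that same person in footage from other cameras, even when pose, illumination, and background have changed. The archive contains several related papers, each addressing a different aspect of person re-ID. Unless stated otherwise, the notes below are inferred from the file names:

1. "【CVPR2017】Person Re-identification in the Wild.pdf": addresses person re-ID in complex, uncontrolled ("in the wild") settings, where illumination, occlusion, and pose variation make the task hard. According to its abstract, it introduces the PRW dataset and baselines for end-to-end pedestrian detection and person re-ID in raw video frames.
2. "【CVPR2017】One-Shot-Metric-Learning-for-Person-Re-identification-Paper.pdf": the title suggests a one-shot learning approach. One-shot learning is a setting in which a model must learn to recognize a new individual from very few examples (here possibly just one). It is especially useful when labeled data is scarce, and may improve generalization.
3. "【CVPR2017】Beyond triplet loss.pdf": triplet loss is a loss function widely used in deep metric learning; it pulls samples of the same identity together and pushes samples of different identities apart. The paper likely proposes a method that goes beyond the standard triplet loss to better separate similar-looking individuals.
4. "【CVPR2017】Joint Detection and Identification Feature Learning for Person Search.pdf": likely learns detection and identification jointly, merging pedestrian detection and person identification into a single framework to optimize the system as a whole.
5. "【CVPR2017】Re-ranking Person Re-identification with k-reciprocal Encoding.pdf": k-reciprocal encoding is a re-ranking technique; it likely refines the initial ranking list to improve overall re-ID accuracy.
6. "【CVPR2017】See the Forest for the Trees.pdf": possibly takes a new perspective that stresses global understanding rather than local features alone.
7. "【CVPR2017】Quality Aware Network for Set to Set Recognition.pdf": a quality-aware network likely accounts for quality differences among the samples when matching one set of images against another, leading to more accurate decisions.
8. "【CVPR2017】Multiple People Tracking by Lifted Multicut and Person Re-identification.pdf": multi-person tracking is a closely related application of person re-ID; the paper likely combines the lifted multicut optimization formulation with re-ID for tracking multiple people.
9. "【CVPR2017】Fast Person Re-Identification via Cross-Camera Semantic Binary Transformation.pdf": cross-camera semantic binary transformation likely yields compact binary codes that make re-ID fast enough for real-time use.
10. "【CVPR】Spindle Net.pdf": the file name omits the year, but Spindle Net was also published at CVPR 2017. It is a person re-ID network that extracts and fuses body-region features guided by human keypoints.

Together these papers reflect progress in person re-ID as of 2017, covering feature learning, loss function design, joint detection and identification, tracking, and re-ranking. Reading them is a good way to understand the state of the art and the open challenges in this field.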
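As a concrete illustration of the triplet loss mentioned in item 3, here is a minimal sketch in plain Python. It assumes Euclidean distance and a margin of 1.0; the function names and the toy embeddings are made up for illustration and are not taken from any of the listed papers. Some formulations use squared distances instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # same identity, distance 1 from the anchor
negative = [3.0, 4.0]   # different identity, distance 5 from the anchor

# The negative is already far enough away, so the loss is zero.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapping the roles simulates a badly embedded triplet with a positive loss.
loss_hard = triplet_loss(anchor, negative, positive)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triplets contribute to training.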


【CVPR2017】Person Re-ID.rar (14 files)

【CVPR2017】Re-ranking Person Re-identification with k-reciprocal Encoding.pdf 1.41MB

【CVPR2017】Joint Detection and Identification Feature Learning for Person Search.pdf 2.09MB

【CVPR2017】Learning Deep Context-aware Features over Body and Latent Parts for Person Reidentification.pdf 312KB

New Folder (新建文件夹)

【CVPR2017】Beyond triplet loss.pdf 2.26MB

【CVPR2017】Person Re-identification in the Wild.pdf 6.44MB

【CVPR2017】See the Forest for the Trees.pdf 1.29MB

【CVPR2017】Multiple People Tracking by Lifted Multicut and Person Re-identification.pdf 1.08MB

【CVPR】Spindle Net.pdf 979KB

【CVPR2017】Point to Set Similarity Based Deep Feature Learning.pdf 811KB

【CVPR2017】Fast Person Re-Identification via Cross-Camera Semantic Binary Transformation.pdf 999KB

【CVPR2017】Scalable Person Re-identification on Supervised Smoothed Manifold.pdf 610KB

【CVPR2017】One-Shot-Metric-Learning-for-Person-Re-identification-Paper.pdf 5.21MB

【CVPR2017】Quality Aware Network for Set to Set Recognition.pdf 1.25MB

【CVPR2017】Consistent-Aware Deep Learning for Person Re-Identification in a Camera Network.pdf 678KB

Person Re-identification in the Wild

Liang Zheng, Hengheng Zhang, Shaoyan Sun, Manmohan Chandraker, Yi Yang, Qi Tian

University of Technology Sydney, UTSA, USTC, UCSD

{liangzheng06,manu.chandraker,yee.i.yang,wywqtian}@gmail.com

Abstract

This paper presents a novel large-scale dataset and comprehensive baselines for end-to-end pedestrian detection and person recognition in raw video frames. Our baselines address three issues: the performance of various combinations of detectors and recognizers, mechanisms for pedestrian detection to help improve overall re-identification (re-ID) accuracy, and the effectiveness of different detectors for re-ID. We make three distinct contributions. First, a new dataset, PRW, is introduced to evaluate Person Re-identification in the Wild, using videos acquired through six near-synchronized cameras. It contains 932 identities and 11,816 frames in which pedestrians are annotated with their bounding box positions and identities. Extensive benchmarking results are presented on this dataset. Second, we show that pedestrian detection aids re-ID through two simple yet effective improvements: a cascaded fine-tuning strategy that trains a detection model first and then the classification model, and a Confidence Weighted Similarity (CWS) metric that incorporates detection scores into similarity measurement. Third, we derive insights in evaluating detector performance for the particular scenario of accurate person re-ID.

1. Introduction

Automated entry and retail systems at theme parks, passenger flow monitoring at airports, behavior analysis for automated driving and surveillance are a few applications where detection and recognition of persons across a camera network can provide critical insights. Yet, these two problems have generally been studied in isolation within computer vision. Person re-identification (re-ID) aims to find occurrences of a query person ID in a video sequence, where state-of-the-art datasets and methods start from pre-defined bounding boxes, either hand-drawn [22, 25, 37] or automatically detected [21, 45]. On the other hand, several pedestrian detectors achieve remarkable performance on benchmark datasets [12, 30], but little analysis is available on how they can be used for person re-ID.

[Figure 1 diagram: raw video frames from Cam 1 and Cam 2, 3, ...; (a) Pedestrian Detection produces the detection result that forms the gallery; (b) Person Re-identification]

Figure 1: Pipeline of an end-to-end person re-ID system. It consists of two modules: pedestrian detection and person recognition (to differentiate from the overall re-ID). This paper not only benchmarks both components, but also provides novel insights in their interactions.

Footnote: L. Zheng, H. Zhang and S. Sun contribute equally. This work was partially supported by the Google Faculty Award and the Data to Decisions Cooperative Research Centre. This work was supported in part to Dr. Qi Tian by ARO grant W911NF-15-1-0290 and Faculty Research Gift Awards by NEC Laboratories of America and Blippar. This work was supported in part by National Science Foundation of China (NSFC) 61429201. Project page: http://www.liangzheng.com.cn

In this paper, we propose a dataset and baselines for practical person re-ID in the wild, which moves beyond sequential application of detection and recognition. In particular, we study three aspects of the problem that have not been considered in prior works. First, we analyze the effect of the combination of various detection and recognition methods on person re-ID accuracy. Second, we study whether detection can help improve re-ID accuracy and outline methods to do so. Third, we study choices for detectors that allow for maximal gains in re-ID accuracy.

Current datasets lack annotations for such combined evaluation of person detection and re-ID. Pedestrian detection datasets, such as Caltech [10] or INRIA [6], typically do not have ID annotations, especially from multiple cameras. On the other hand, person re-ID datasets, such as VIPeR [16] or CUHK03 [21], usually provide just cropped bounding boxes without the complete video frames, especially at a large scale. As a consequence, a large-scale dataset that evaluates both detection and overall re-ID is needed. To address this, Section 3 presents a novel large-scale dataset called PRW that consists of 932 identities, with bounding boxes across 11,816 frames. The dataset comes with annotations and extensive baselines to evaluate the impacts of detection and recognition methods on person re-ID accuracy.

arXiv:1604.02531v2 [cs.CV] 6 Apr 2017

In Section 4, we leverage the volume of the PRW dataset to train state-of-the-art detectors such as R-CNN [15], with various convolutional neural network (CNN) architectures such as AlexNet [19], VGGNet [31] and ResidualNet [17]. Several well-known descriptors and distance metrics are also considered for person re-ID. However, our joint setup allows two further improvements in Section 4.2. First, we propose a cascaded fine-tuning strategy to make full use of the detection data provided by PRW, which results in improved CNN embeddings. Two CNN variants are derived w.r.t. the fine-tuning strategies. Novel insights can be learned from the new fine-tuning method. Second, we propose a Confidence Weighted Similarity (CWS) metric that incorporates detection scores. Assigning lower weights to false positive detections prevents a drop in re-ID accuracy due to the increase in gallery size with the use of detectors.
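The idea behind a confidence-weighted similarity can be sketched as follows. This is a minimal illustration and not the paper's exact formulation: it assumes cosine similarity between feature vectors and min-max normalization of detector scores over the gallery, and all function names and toy values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def min_max(scores):
    """Rescale raw detector scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_gallery(query, gallery_feats, det_scores, use_cws=True):
    """Return gallery indices sorted by (optionally weighted) similarity.

    Assumes non-negative similarities, so a lower weight always lowers the rank.
    """
    weights = min_max(det_scores) if use_cws else [1.0] * len(det_scores)
    sims = [cosine(query, g) * w for g, w in zip(gallery_feats, weights)]
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)


# Toy example: box 2 looks exactly like the query but has a very low
# detection score, as a background false positive might.
query = [1.0, 0.0]
gallery = [[1.0, 0.1], [0.9, 0.5], [1.0, 0.0]]
det_scores = [2.0, 1.5, -1.0]

plain_rank = rank_gallery(query, gallery, det_scores, use_cws=False)
cws_rank = rank_gallery(query, gallery, det_scores, use_cws=True)
```

In this toy setup the low-confidence box ranks first under plain cosine similarity and last once detection confidence is taken into account, which is the effect the paragraph above describes.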

Given a dataset like PRW that allows simultaneous evaluation of detection and re-ID, it is natural to consider whether any complementarity exists between the two tasks. For a particular re-ID method, it is intuitive that a better detector should yield better accuracy. But we argue that the criteria for determining a detector as better are application-dependent. Previous works in pedestrian detection [10, 28, 43] usually use Average Precision or Log-Average Miss Rate under IoU > 0.5 for evaluation. However, through extensive benchmarking on the proposed PRW dataset, we find in Section 5 that IoU > 0.7 is a more effective rule in indicating detector influences on re-ID accuracy. In other words, the localization ability of detectors plays a critical role in re-ID.
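To make the two thresholds concrete, the sketch below computes intersection-over-union (IoU) for axis-aligned boxes and checks one detection against both criteria. The box format `(x1, y1, x2, y2)`, the function names, and the example coordinates are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(detection, ground_truth, threshold):
    """A detection counts as correct when its IoU exceeds the threshold."""
    return iou(detection, ground_truth) > threshold


# A 100 x 200 ground-truth pedestrian box and a detection shifted 20 px right.
ground_truth = (0, 0, 100, 200)
detection = (20, 0, 120, 200)

overlap = iou(detection, ground_truth)                       # 16000 / 24000
loose = is_true_positive(detection, ground_truth, 0.5)
strict = is_true_positive(detection, ground_truth, 0.7)
```

Here the shifted box has an IoU of about 0.67, so it counts as a correct detection under IoU > 0.5 but not under IoU > 0.7; the stricter rule rewards tighter localization.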

Figure 1 presents the pipeline of the end-to-end re-ID system discussed in this paper. Starting from raw video frames, a gallery is created by pedestrian detectors. Given a query person-of-interest, gallery bounding boxes are ranked according to their similarity with the query. To summarize, our main contributions are:

• A novel large-scale dataset, Person Re-identification in the Wild (PRW), for simultaneous analysis of person detection and re-ID.

• Comprehensive benchmarking of state-of-the-art detection and recognition methods on the PRW dataset.

• Novel insights into how detection aids re-ID, along with an effective fine-tuning strategy and similarity measure to illustrate how they might be utilized.

• Novel insights into the evaluation of pedestrian detectors for the specific application of person re-ID.

Figure 2: Annotation interface. All appearing pedestrians are annotated with a bounding box and ID. ID ranges from 1 to 932, and -2 stands for ambiguous persons.

2. Related Work

An overview of existing re-ID datasets. In recent years, a number of person re-ID datasets have been released [16, 20, 21, 44, 45, 48, 48]. They vary in the number of IDs and boxes (see Table 1). Despite some differences among them, a common property is that the pedestrians are confined within pre-defined bounding boxes that are either hand-drawn (e.g., VIPeR [16], iLIDS [48], CUHK02 [20]) or obtained using detectors (e.g., CUHK03 [21], Market-1501 [45] and MARS [44]). PRW is a follow-up to our previous releases [44, 45] and requires considering the entire pipeline for person re-ID from scratch.

Pedestrian detection. Recent pedestrian detection works feature the "proposal+CNN" approach. Pedestrian detection usually employs weak pedestrian detectors as proposals, which allows achieving relatively high recall using very few proposals [24, 27–29]. Despite the impressive recent progress in pedestrian detection, it has been rarely considered with person re-ID as an application. This paper attempts to determine how detection can help re-ID and provide insights in assessing detector performance.

Person re-ID. Recent progress in person re-ID has mainly come from deep learning. Several works [1, 8, 21, 40, 44] focus on learning features and metrics through the CNN framework. Formulating person re-ID as a ranking task, image pairs [1, 21, 40] or triplets [8] are fed into a CNN. It is also shown in [47] that deep learning using the identification model [35, 44, 50] yields even higher accuracy than the siamese model. With a sufficient amount of training data per ID, we thus adopt the identification model to learn a CNN embedding in the pedestrian subspace. We refer readers to our recent works [47, 50] for details.

Detection and re-ID. To our knowledge, two previous works focus on such end-to-end systems. In [42], persons in photo albums are detected using poselets [4] and recognition is performed using face and global signatures. However, the setting in [42] is not typical for person re-ID where pedestrians are observed by surveillance cameras and faces are not clear enough. In a work closer to ours, Xu et al. [39] jointly model pedestrian commonness and uniqueness, and calculate the similarity between query and each sliding window in a brute-force manner. While [39] works on datasets consisting of no more than 214 video frames, it may have efficiency issues with large datasets. Departing from both works, this paper sets up a large-scale benchmark system to jointly analyze detection and re-ID performance.

Finally, we would like to refer readers to [36], concurrent to ours and published in the same conference, which also releases a large dataset for end-to-end person re-ID.

Datasets        PRW       CAMPUS [38]  EPFL [3]  Market-1501 [45]  RAiD [7]  VIPeR [16]  i-LIDS [48]  CUHK03 [21]
#frame          11,816    214          80        -                 -         -           -            -
#ID             932       74           30        1,501             43        632         119          1,360
#annotated box  34,304    1,519        294       25,259            6,920     1,264       476          13,164
#box per ID     36.8      20.5         9.8       19.9              160.9     2           2            9.7
#gallery box    100-500k  1,519        294       19,732            6,920     1,264       476          13,164
#camera         6         3            4         6                 4         2           2            2

Table 1: Comparing PRW with existing image-based re-ID datasets [3, 7, 16, 21, 38, 45, 48].

[Figure 3 image: detected boxes in three column groups: persons w/ ID, persons w/o ID, background]

Figure 3: Examples of detected bounding boxes from video frames in the PRW dataset. In "persons w/ ID", each column contains 4 detected boxes of the same identity from distinctive views. Column "persons w/o ID" presents persons who do not have an ID in the dataset. Column "background" shows false positive detection results. The detector used in this figure is DPM + RCNN (AlexNet).

3. The PRW Dataset

3.1. Annotation Description

The videos were collected at Tsinghua University and total 10 hours in length. This aims to mimic the application in which a person-of-interest goes out of the field-of-view of the current camera for a short duration and needs to be located from nearby cameras. A total of 6 cameras were used, among which 5 are 1080 × 1920 HD and 1 is 576 × 720 SD. The video captured by each camera is annotated every 25 frames (1 second in duration). We first manually draw a bounding box for all pedestrians who appear in the frames and then assign an ID if it exists in the Market-1501 dataset. Since all pedestrians are boxed, when we are not sure about a person's ID (ambiguity), we assign −2 to it. These ambiguous boxes are used in detector training and testing, but are excluded in re-ID training and testing. Figure 2 and Figure 3 show the annotation interface and sample detected boxes, respectively.
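The annotation convention described above (every box is kept for detection, boxes with ID −2 are dropped for re-ID) can be sketched as a simple filter. The `Box` record, its field names, and the sample values are hypothetical; the actual PRW annotation file format is not described in this section.

```python
from typing import List, NamedTuple

AMBIGUOUS_ID = -2  # ID assigned to persons whose identity is uncertain


class Box(NamedTuple):
    """One annotated pedestrian box (hypothetical record layout)."""
    frame: str
    x: float
    y: float
    w: float
    h: float
    pid: int


def detection_boxes(boxes: List[Box]) -> List[Box]:
    """All annotated boxes, ambiguous ones included, are used for detection."""
    return list(boxes)


def reid_boxes(boxes: List[Box]) -> List[Box]:
    """Only boxes with a confirmed identity are used for re-ID."""
    return [b for b in boxes if b.pid != AMBIGUOUS_ID]


# Made-up annotations for two frames.
annotations = [
    Box("cam1_frame0001", 10, 20, 50, 120, 17),
    Box("cam1_frame0001", 200, 30, 45, 110, AMBIGUOUS_ID),
    Box("cam2_frame0001", 80, 25, 55, 130, 17),
]

det_set = detection_boxes(annotations)
reid_set = reid_boxes(annotations)
```

With these toy annotations, the detection set keeps all three boxes while the re-ID set keeps only the two boxes of identity 17.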

[Figure 4 plots: (a) height distribution, counts vs. height (pixels); (b) aspect ratio distribution, counts vs. aspect ratio (w/h)]

Figure 4: Distribution of pedestrian height and aspect ratio (width/height) in the PRW dataset.

A total of 11,816 frames are manually annotated to obtain 43,110 pedestrian bounding boxes, among which 34,304 pedestrians are annotated with an ID ranging from 1 to 932 and the rest are assigned an ID of −2. In Table 1, we compare PRW with previous person re-ID datasets regarding numbers of frames, IDs, annotated boxes, annotated boxes per ID, gallery boxes and number of cameras. Specifically, since we densely label all the subjects, the number of boxes for each identity is almost twice that of Market-1501. Moreover, when forming the gallery, the detectors produce 100k-500k boxes depending on the threshold. The distinctive feature enabled by the PRW dataset is the end-to-end evaluation of person re-ID systems. This dataset provides the original video frames along with hand-drawn ground truth bounding boxes, which makes it feasible to evaluate both pedestrian detection and person re-ID. But more importantly, PRW enables assessing the influence of pedestrian detection on person re-ID, which is a topic of great interest for practical applications but rarely considered in previous literature.
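The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text and in Table 1; the variable names are arbitrary.

```python
# Counts stated in the text.
total_boxes = 43_110      # all annotated pedestrian boxes
id_boxes = 34_304         # boxes with an ID in 1..932
num_ids = 932

# Boxes that received the ambiguous ID of -2.
ambiguous_boxes = total_boxes - id_boxes

# Average number of ID-labelled boxes per identity (Table 1 lists 36.8).
boxes_per_id = id_boxes / num_ids

# Ratio to the Market-1501 value listed in Table 1 (19.9 boxes per ID).
market_boxes_per_id = 19.9
ratio_vs_market = round(boxes_per_id, 1) / market_boxes_per_id

print(f"ambiguous boxes: {ambiguous_boxes}")
print(f"boxes per ID:    {boxes_per_id:.1f}")
print(f"ratio vs Market: {ratio_vs_market:.2f}")
```

The result is 8,806 ambiguous boxes and about 36.8 boxes per identity, roughly 1.85 times the Market-1501 figure in Table 1, consistent with the "almost twice" statement.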

3.2. Evaluation Protocols

The PRW dataset is divided into a training set with 5,704 frames and 482 IDs and a test set with 6,112 frames and 450 IDs. We choose this split since it enables the minimum ID overlap between training and testing sets. Detailed statistics of the splits are presented in Table 2.

Pedestrian Detection. A number of popular pedestrian datasets exist, to name a few, INRIA [6], Caltech [10] and KITTI [13]. The INRIA dataset contains 1,805 128 × 64 pedestrian images cropped from personal photos; the Caltech dataset provides ∼350k bounding boxes from ∼132k frames; the KITTI dataset has 80k labels for the pedestrian class. With respect to the number of annotations, PRW (∼43k

Uploaded by: laowei11
