IET Image Processing
Research Article
Multilevel framework to handle object
occlusions for real-time tracking
ISSN 1751-9659
Received on 17th March 2016
Revised 24th May 2016
Accepted on 17th June 2016
doi: 10.1049/iet-ipr.2016.0176
www.ietdl.org
Yingfeng Cai¹, Hai Wang², Xiaobo Chen¹, Long Chen¹
¹Automotive Engineering Research Institute, Jiangsu University, Zhenjiang 212013, People's Republic of China
²School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013, People's Republic of China
E-mail: wanghai1019@163.com
Abstract: This study proposes an efficient method to handle the object occlusions seen in monocular traffic image sequences.
The motivation is that different methods perform differently in occlusion segmentation, so the authors use a situation-driven
approach that aggregates different methods to achieve good overall performance. Occlusion is classified into four categories
according to the foreground situation, and a multilevel occlusion handling framework is utilised. First, an image segmentation
algorithm based on convex hull analysis is utilised for intra-frame level occlusion segmentation. This algorithm relies on the
compactness ratio and interior distance ratio of the foreground. Second, an online sample-based classification algorithm is
utilised for tracking level occlusion segmentation. Training samples are extracted from the historical frames before occlusion,
and testing samples are extracted from the current frame by an adaptive searching strategy. The segmentation of occlusion is
thereby transformed into the online classification of testing samples. This algorithm relies on the similarity and coherence of
the target's properties between consecutive frames. Experiments on video sequences illustrate the good performance of the
proposed method under different conditions with low computational cost.
1 Introduction
In video surveillance systems, accurate multi-target detection and
tracking is a key foundation of high-level behaviour analysis and
event understanding [1]. It provides rich parameters such as target
classification, target trajectory, traffic flow, traffic density and lane
occupancy rate. Such information is useful to road users and can
potentially reduce traffic congestion and enhance road safety [2, 3].
However, in monocular systems, the loss of depth information in
the projection of a 3D scene onto a 2D image plane may cause
occlusion between objects. In these cases, the trackers are quite
likely to miss objects, which can in turn lead to wrong estimates
of traffic parameters. Hence, occlusion detection and handling
methods are usually applied to improve the accuracy of
multiple-object tracking [4–6].
1.1 Related work
The 3D/2.5D model has long received great attention in occlusion
handling. Pang et al. [7] presented a 2.5D model to handle vehicle
occlusion. They added to the 2D imaging plane an axis that is
perpendicular to the lane vanishing point and named the resulting
model a 2.5D model. The occluded vehicles can be separated by
contour searching even in congested traffic scenes. Song and
Nevatia [8] integrated a 3D vehicle model, the camera calibration
result and ground plane knowledge into a Bayesian framework to
detect and track occluded vehicles by searching for a maximum
a posteriori solution. Lou et al. [9] proposed an online method to
detect and eliminate occlusion using a 3D vehicle model. The pose
of the 3D vehicle model was decomposed into translation and
rotation. An improved extended Kalman filter was also proposed to
track and predict vehicle motion with a precise kinematics model.
The feature model is a popular method to handle occlusion in
region-based tracking. Features such as Gabor, colour, edge and
corner are usually used to represent the object. They are then fed
into a classifier or reasoning model to recognise objects. Such a
model works well when the matching features of the occluded
objects remain visible during tracking. Kanhere and Birchfield [10]
used a feature matching method to segment occlusions. The feature
points are detected and tracked through the image sequence, and
3D vehicles are reconstructed from these points and a relative
height constraint.
Gentile et al. [11] studied the relationship between tracking
performance and different features of the target. According to
this relationship, the most influential features are selected and the
object is divided into small blocks. By tracking these blocks,
the occluded target can be tracked successfully. Zhu et al. [12]
proposed an online sample-based occlusion handling framework
based on the similarity of the local colour, texture and spatial
features in sequential frames. This paper is partly inspired by Zhu's
method. Compared with Zhu's method, ours has a better
framework with lower computational cost.
The part-based model has recently attracted the attention of many
researchers for object recognition. Combined with reasoning models
or grammar models, the part-based model handles partial occlusion
efficiently. Niknejad et al. [13] proposed a two-layer classifier
using deformable part models (DPMs) and a conditional random
field to detect occluded vehicles. Such a framework works well in
urban environments. Li et al. proposed an AND-OR graph method
[14] and an AND-OR graph and hybrid image templates method
[15] to detect front-view and rear-view vehicles. Li's methods work
well in congested traffic conditions, but they focus on two
views of vehicles and cannot deal with person-vehicle and
inter-person occlusions. Tian et al. [16] proposed a vehicle detection
algorithm based on DPM and object detection grammars. The occlusion
is handled using the results of DPM and the specific occlusion
grammars. Like Li's methods, Tian's method only considers two
views of vehicles.
In addition, statistical and reasoning models have also been
proposed to solve the occlusion problem. Zhang et al. [17]
proposed a multilevel framework to handle vehicle occlusion.
Their framework consists of three levels: the intraframe,
interframe and tracking levels. Convex hull analysis, optical flow
and bidirectional reasoning are used at the corresponding levels
to solve the occlusion problem. These methods can deal with
moving vehicles, but they might fail for stopped ones in congested
traffic scenes. Jung et al. [5] and Veeraraghavan et al. [18] used
occlusion-reasoning methods to detect and separate occlusions
based on a priori knowledge trajectory matching. Huang and Liao
[19] treated the object as a set of moving masses and separated
occlusions using the standardised variance difference of the
masses' motion vectors.
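As an illustration of the convex hull analysis used at the intraframe level, the following minimal Python sketch computes a compactness ratio for a binary foreground blob. The definition used here (foreground pixel area divided by the area of the blob's convex hull) is an assumption made for illustration and may differ from the definitions in [17] or in the proposed method; the function names `convex_hull`, `polygon_area` and `compactness_ratio` are hypothetical. The intuition is that a single vehicle tends to fill its convex hull, whereas a blob formed by two occluding objects tends to leave concave gaps and so yields a lower ratio.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def _cross(o: Point, a: Point, b: Point) -> int:
    """Z-component of (a - o) x (b - o); zero when the points are collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices, collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(poly: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def compactness_ratio(mask: Sequence[Sequence[int]]) -> float:
    """Assumed definition: foreground pixel count / convex hull area.

    Each foreground pixel is treated as a unit square, so the hull is built
    from pixel corners and the ratio lies in (0, 1] for a non-empty blob.
    """
    corners: List[Point] = []
    pixel_count = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                pixel_count += 1
                corners.extend([(c, r), (c + 1, r), (c, r + 1), (c + 1, r + 1)])
    if pixel_count == 0:
        return 0.0
    return pixel_count / polygon_area(convex_hull(corners))


# A solid square fills its hull; an L-shaped blob does not.
square_mask = [[1, 1], [1, 1]]
l_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
square_ratio = compactness_ratio(square_mask)
l_ratio = compactness_ratio(l_mask)
```

In a segmentation step, a blob whose ratio falls below some threshold could be flagged as a candidate occlusion; the threshold itself would have to be chosen empirically and is not specified here.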
IET Image Process.
© The Institution of Engineering and Technology 2016