1894 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 16, NO. 12, DECEMBER 2019
Vehicle Detection From High-Resolution Remote
Sensing Imagery Using Convolutional
Capsule Networks
Yongtao Yu , Member, IEEE, Tiannan Gu, Haiyan Guan , Senior Member, IEEE, Dilong Li, and Shenghua Jin
Abstract— Vehicle detection plays an important role in a vari-
ety of traffic-related applications. However, due to the scale and
orientation variations and partial occlusions of vehicles, it is still
challengeable to accurately detect vehicles from remote sensing
images. This letter proposes a convolutional capsule network for
detecting vehicles from high-resolution remote sensing images.
First, a test image is segmented into superpixels to generate
meaningful and nonredundant patches. Then, these patches are
input to a convolutional capsule network to label them into
vehicles or the background. Finally, nonmaximum suppression
is adopted to eliminate repetitive detections. Quantitative evalu-
ations on four test data sets show that average completeness,
correctness, quality, and F
1
-measure of 0.93, 0.97, 0.90, and
0.95, respectively, are obtained. Comparative studies with three
existing methods confirm that the proposed method effectively
performs in detecting vehicles of various conditions.
Index Terms— Convolutional capsule network, deep learn-
ing, remote sensing imagery, superpixel segmentation, vehicle
detection.
I. INTRODUCTION
T
RAFFIC monitoring is an important routine work. Peri-
odical and accurate traffic monitoring results can direct
transport agencies to conduct traffic controls and road plan-
ning. Traditional means for traffic monitoring are usually
performed through on-site surveillances or using traffic cam-
eras. Such means are labor-intensive, costly, and unsafe to
some extent, especially for monitoring large areas. With the
rapid d evelopment of optical remote sensing technologies, it is
quite convenient and cost-effective to capture high-resolution
remote sensing images using satellite sensors or unmann ed
aerial vehicles (UAV). Satellite sensors have large perspec-
tives. They can easily measure an extensive area of interest.
Manuscript receive d January 6, 2019; revised March 15, 2019; accepted
April 15, 2019. Date of publication May 8, 2019; date of current version
November 22, 2019. This work was supported by the Natural Science
Research in Colleges and Universities of Jiangsu Province under Grant
16KJB520006, by the National Natural Science Foundation of China under
Grants 61603146 and 41671454, and by the Natural Science Foundation
of Jiangsu Province under Grant BK20160427. (Corresponding author:
Yongtao Yu.)
Y. Yu, T. Gu, and S. Jin are with the Faculty of Computer and Software
Engineering, Huaiyin Institute of Technology, Huaian 223003, China (e-mail:
allennessy.yu@gmail.com; allennessy@hyit.edu.cn).
H. Guan is with the School of Remote Sensing and Geomatics Engineering,
Nanjing University of Information Science and Technology, Nanjing 210044,
China (e-mail: guanhy .nj@nuist.edu.cn; guanhy.nj@gmail.com).
D. Li is with the State Key Laboratory of Information Engineering in
Surve ying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072,
China (e-mail: scholar.dll@gmail.com).
Color versions of one or more of the figures in this letter are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LGRS.2019.2912582
In contrast, UAV systems have the advantages of high porta-
bility and flying flexibility, as well as a low-cost platform.
Thus, due to the superior advantages of remote sensing
images, they have been widely used in a variety of traffic-
related applications. Consequently, extensive studies have also
been conducted for information extraction and interpretation
from remote sensing images, such as road extraction [1],
road feature extraction [2], vehicle detection [3], and traffic
monitoring [4].
Vehicle d etection from remote sensing images plays a
significant role in traffic-related applications. The location and
type information of vehicles can be applied to traffic flow con-
trol, road network planning, parking condition evaluation, and
so on. Existing methods for vehicle detection can be roughly
categorized into the implicit model and explicit model based
methods [5]. For implicit model based methods, great efforts
are paid to exploit local features of vehicles. Comparatively,
explicit model based methods usually use a box or a feature
model to characterize a vehicle. Although many achievements
have been made in th e literature, auto m ated and accurate
detection of vehicles from remote sensing images are still chal-
lenging because of appearance, scale, orientation variations,
partial occlusions, and image qualities.
In [6], local and global structure learning was adopted
to construct a pair of detectors. The root detector local-
ized potential vehicle regions, whereas the part detector was
scanned within the region to verify the existence of vehi-
cles. A binary detector trained with integral channel features
was proposed in [7]. In [8], aerial images were clipped
into semantic elements, which were input to a classifier to
detect vehicles. To solve the problem of insufficient labeled
samples, multiinstance discriminative learning [5 ] and transfer
learning [9] were also explored for vehicle detection. Sparse
representation was intro duced to direct distinct training sample
selection [10], [11]. In [12], the bag-of-words model with
spatial structure constraints was adopted to depict statistical
features of vehicles. In addition, vehicle detection by fus-
ing multisource data was also studied [13]. Recently, some
other methods, such as Viola–Jones [14], affine invariant
description [15], convolutional support vector machine (SVM)
networks [16], and so on, have also been proposed for vehicle
detection.
Due to the superior properties of deep-learning techniques
in mining high-order feature representations, vehicle detec-
tion based on deep-learning models has been intensively
studied. In [17], convolutional neural networks (CNNs) and
hard example mining were used to d etect vehicles. Similarly,
an iterative sample selection strategy was proposed in [18] to
1545-598X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.