
Comprehensive survey of deep learning in remote sensing: theories, tools, and challenges for the community

John E. Ball

Derek T. Anderson

Chee Seng Chan

John E. Ball, Derek T. Anderson, Chee Seng Chan, “Comprehensive survey of deep learning in remote sensing: theories, tools, and challenges for the community,” J. Appl. Remote Sens. 11(4), 042609 (2017), doi: 10.1117/1.JRS.11.042609.


Mississippi State University, Department of Electrical and Computer Engineering, Mississippi State, Mississippi, United States

University of Malaya, Faculty of Computer Science and Information Technology, Kuala Lumpur, Malaysia

Abstract. In recent years, deep learning (DL), a rebranding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, and natural language processing. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should not only be aware of advancements such as DL, but also be leading researchers in this area. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools, and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modeling physical phenomena, (iii) big data, (iv) nontraditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial, and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.

Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI. [DOI: 10.1117/1.JRS.11.042609]

Keywords: remote sensing; deep learning; hyperspectral; multispectral; big data; computer vision.

Paper 170464SS received Jun. 3, 2017; accepted for publication Aug. 16, 2017; published online Sep. 23, 2017.

1 Introduction

In recent years, deep learning (DL) has led to leaps, versus incremental gain, in fields such as computer vision (CV), speech recognition, and natural language processing, to name a few. The irony is that DL, a surrogate for neural networks (NNs), is an age-old branch of artificial intelligence that has been resurrected due to factors such as algorithmic advancements, high-performance computing, and big data. The idea of DL is simple: the machine learns the features and is usually very good at decision making (classification) versus a human manually designing the system. The RS field draws from core theories such as physics, statistics, fusion, and machine learning, to name a few. Therefore, the RS community should be aware of and at the leading edge of advancements such as DL. The aim of this paper is to provide resources with respect to theory, tools, and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modeling physical phenomena, (iii) big data, (iv) nontraditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial, and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.

*Address all correspondence to: John E. Ball, E-mail: jeball@ece.msstate.edu

REVIEW

Journal of Applied Remote Sensing 042609-1 Oct–Dec 2017

•

Vol. 11(4)

Herein, RS is a technological challenge where objects or scenes are analyzed by remote means. This definition includes the traditional remote sensing (RS) areas, such as satellite-based and aerial imaging. However, RS also includes nontraditional areas, such as unmanned aerial vehicles (UAVs), crowdsourcing (phone imagery, tweets, etc.), and advanced driver-assistance systems (ADAS). These types of RS offer different types of data and have different processing needs, and thus also come with new challenges to algorithms that analyze the data. The contributions of this paper are as follows:

1. Thorough list of challenges and open problems in DL RS. We focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modeling physical phenomena, (iii) big data, (iv) nontraditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial, and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. These observations are based on surveying RS DL and feature learning (FL) literature, as well as numerous RS survey papers. This topic is the majority of our paper and is discussed in Sec. 4.

2. Thorough literature survey. Herein, we review 205 RS application papers and 57 survey papers in RS and DL. In addition, many relevant DL papers are cited. Our work extends the previous DL survey papers (Refs. 1–3) to be more comprehensive. We also cluster DL approaches into different application areas and provide detailed discussions of many relevant papers in these areas in Sec. 3.

3. Detailed discussions of modifying DL architectures to tackle RS problems. We highlight approaches in DL for RS, including new architectures, tools, and DL components that current RS researchers have implemented in DL. This is discussed in Sec. 4.5.

4. Overview of DL. For RS researchers not familiar with DL, Sec. 2 provides a high-level overview of DL and lists many good references for interested readers to pursue.

5. DL tool list. Tools are a major enabler of DL, and we review the more popular DL tools. We also list pros and cons of several of the most popular toolsets and provide a table summarizing the tools, with references and links (refer to Table 1). For more details, see Sec. 2.3.5.

6. Online summaries of RS data sets and DL RS papers reviewed. First, an extensive online table gives details about each DL RS paper we reviewed: sensor modalities, a compilation of the data sets used, a summary of the main contribution, and references. Second, a data set summary for all the DL RS papers analyzed in this paper is provided online. It contains the data set name, a description, a URL (if one is available), and a list of references. Since the literature review for this paper was so extensive, these tables are too large to put in the main paper but are provided online for the readers’ benefit. These tables are located at http://cs-chan.com/source/FADL/Online_Paper_Summary_Table.pdf and http://cs-chan.com/source/FADL/Online_Dataset_Summary_Table.pdf.

This paper is organized as follows. Section 2 discusses related work in CV. This section

contrasts deep and “shallow” learning, and summarizes DL architectures. The main reasons

for success of DL are also discussed in this section. Section 3 provides an overview of DL

in RS, highlighting DL approaches in many disparate areas of RS. Section 4 discusses the unique

challenges and open issues in applying DL to RS. Conclusions and recommendations are listed

in Sec. 5.

2 Related Work in CV

CV is a field of study that aims to achieve visual understanding through computer analysis of imagery. Traditional (aka classical) approaches are sometimes referred to as “shallow” nowadays because there are typically only a few processing stages, e.g., image denoising or enhancement followed by feature extraction then classification, that connect the raw data to our final decisions. Examples of “shallow learners” include support vector machines (SVMs), Gaussian mixture models, hidden Markov models, and conditional random fields. In contrast, DL usually has many layers (the exact demarcation between “shallow” and “deep” learning is not a set number, akin to multi- and hyperspectral signals), which allows a rich variety of complex, nonlinear, and hierarchical features to be learned from the data. The following sections contrast deep and shallow learning, discuss DL approaches and DL enablers, and finally discuss DL success in domains outside RS. Overall, the challenge of human-engineered solutions is the manual or experimental discovery of which feature(s) and classifier satisfy the task at hand. The challenge of DL is to define the appropriate network topology and subsequently optimize its hyperparameters.

2.1 Traditional Feature Learning Methods

Traditional methods of feature extraction involve hand-coded transforms that extract information

based on spatial, spectral, textural, morphological, and other cues. Examples are discussed in

detail in the following references; we do not give extensive algorithmic details herein.

Cheng et al. discuss traditional hand-crafted features such as the histogram of oriented gradients (HOG), which resembles the gradient-orientation histograms used in the scale-invariant feature transform (SIFT), color histograms, local binary patterns (LBP), etc. They also discuss unsupervised FL methods, such as k-means clustering and sparse coding. Other good survey papers discuss hyperspectral image (HSI) feature analysis, kernel-based methods, statistical learning methods in HSI, spectral distance functions, pedestrian detection, multiclassifier systems, spectral–spatial classification, change detection (Refs. 11 and 12), machine learning in RS, manifold learning, endmember extraction, and spectral unmixing (Refs. 16–20).

Traditional FL methods can work quite well, but (1) they require a high level of expertise and very specific domain knowledge to create the hand-crafted features, (2) sometimes the proposed solutions are fragile, that is, they work well with the data being analyzed but do not perform well on new data, (3) sophisticated methods may be required to properly handle irregular or complicated decision surfaces, and (4) shallow systems that learn hierarchical features can become very complex.

In contrast, DL approaches (1) learn from the data itself, which means the expertise for feature engineering is replaced (partially or completely) by the DL, (2) DL has state-of-the-art results in many domains (and these results are usually significantly better than shallow approaches), and (3) DL in some instances can outperform humans and human-coded features.

However, there are also considerations when adopting DL approaches: (1) many DL systems have a large number of parameters and require a significant amount of training data; (2) choosing the optimal architecture and training it optimally are still open questions in the DL community; (3) there is still a steep learning curve if one wants to really understand the math and operations of the DL systems; (4) it is hard to comprehend what is going on “under the hood” of DL systems; and (5) adapting very successful DL architectures to fit RS imagery analysis can be challenging.

2.2 DL Approaches

To date, the autoencoder (AE), convolutional neural network (CNN), deep belief networks (DBNs), and recurrent NN (RNN) have been the four mainstream DL architectures. Of these architectures, the CNN is the most popular and most published to date. The deconvolutional NN (DeconvNet) (Refs. 21 and 22) is a relative newcomer to the DL community. The following sections discuss each of these architectures at a high level. Many references are provided for the interested reader.

2.2.1 Autoencoder

An AE is an NN that is used for unsupervised learning of efficient codings (from unlabeled data). The AE’s codings often reveal useful features from unsupervised data. One of the first AE applications was dimensionality reduction, which is required in many RS applications. One advantage of using an AE with RS data is that the data do not need to be labeled. In an AE, reducing the size of the adjacent layers forces the AE to learn a compact representation of the data. The AE maps the input through an encoder function $f$ to generate an internal (latent) representation, or code, $h$. The AE also has a decoder function $f'$ that maps $h$ to the output $x'$. Let an input vector to an AE be $x \in \mathbb{R}^{d}$. In a simple one-hidden-layer case, the encoder is $h = f(x) = g(Wx + b)$, where $g$ is an activation function, $W$ is the learned weight matrix, and $b$ is a bias vector. A decoder then maps the latent representation to a reconstruction or approximation of the input via $x' = f'(h) = g(W'h + b')$, where $W'$ and $b'$ are the decoding weight and bias, respectively. Usually, the encoding and decoding weight matrices are tied (shared), so that $W' = W^{T}$, where $T$ is the matrix transpose operator.
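As an illustration of the encoder and decoder mappings described above, the following is a minimal sketch of a single forward pass through a one-hidden-layer AE with tied weights. It is not taken from the surveyed work: the sigmoid activation, the helper names (`affine`, `encode`, `decode`), and the toy parameter values are all assumptions chosen for readability, and no training is performed.

```python
import math

def sigmoid(z):
    """Elementwise logistic activation, standing in for the activation g (an assumption)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def affine(W, x, b):
    """Compute W x + b, with the matrix W given as a list of rows."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]

def encode(W, b, x):
    """Encoder: h = g(W x + b)."""
    return sigmoid(affine(W, x, b))

def decode(W, b_prime, h):
    """Decoder with tied weights: x' = g(W^T h + b')."""
    return sigmoid(affine(transpose(W), h, b_prime))

# Hypothetical toy parameters: a 4-dimensional input compressed to a 2-dimensional code.
W = [[0.5, -0.5, 0.25, 0.0],
     [0.0, 0.75, -0.25, 0.5]]
b = [0.0, 0.1]
b_prime = [0.0, 0.0, 0.0, 0.0]

x = [1.0, 0.0, 1.0, 0.0]
h = encode(W, b, x)              # latent code with 2 entries
x_rec = decode(W, b_prime, h)    # reconstruction with 4 entries
print("code:", h)
print("reconstruction:", x_rec)
```

Because the hidden layer is smaller than the input, the code `h` is forced to be a compressed representation; in practice `W`, `b`, and `b_prime` would be learned by minimizing a reconstruction loss rather than fixed by hand.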

In general, the AE is constrained, either through its architecture, or through a sparsity constraint (or both), to learn a useful mapping (but not the trivial identity mapping). A loss function $L$ measures how closely the AE reconstructs its input: $L$ is a function of $x$ and $x' = f'[f(x)]$. For example, a commonly used loss function is the mean squared error, which penalizes the approximated output for being different from the input, $L(x, x') = \lVert x - x' \rVert^{2}$.

A regularization function $\Omega(h)$ can also be added to the loss function to force a more sparse solution. The regularization function can involve penalty terms for model complexity, model prior information, penalizing based on derivatives, or penalties based on some other criteria such as supervised classification results (see Sec. 14.2 of Ref. 23). Regularization is typically used with a deep encoder, not a shallow one. Regularization encourages the AE to have other properties than just reconstructing the input, such as making the representation sparse, robust to noise, or constraining derivatives in the representation. Arpit et al. show that denoising and contractive AEs that also contain activations such as sigmoid or rectified linear units (ReLUs), which are commonly found in many AEs, satisfy conditions sufficient to encourage sparsity. A common practice is to add a weighted regularizing term to the optimization function, such as the $\ell_1$ norm, $\lambda \lVert h \rVert_{1} = \lambda \sum_{i=1}^{K} |h[i]|$, where $K$ is the dimensionality of $h$ and $\lambda$ is a term which controls how much effect the regularization has on the optimization process. This optimization can be solved using alternating optimization over $W$ and $h$.
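To make the role of the weighting term concrete, the sketch below evaluates a sparsity-regularized objective for one sample, assuming the penalty is simply added to a squared-error reconstruction loss. The function names and the numeric values, including the weight `lam`, are hypothetical and serve only to show how the two terms combine.

```python
def squared_error(x, x_rec):
    """Reconstruction term: squared Euclidean distance between input and reconstruction."""
    return sum((a - r) ** 2 for a, r in zip(x, x_rec))

def l1_penalty(h):
    """Sparsity term: sum of absolute values of the code entries."""
    return sum(abs(v) for v in h)

def regularized_loss(x, x_rec, h, lam):
    """Assumed objective: reconstruction error plus lam times the L1 norm of the code."""
    return squared_error(x, x_rec) + lam * l1_penalty(h)

# Hypothetical input, reconstruction, and latent code.
x = [1.0, 2.0, 3.0]
x_rec = [1.0, 1.5, 3.5]
h = [0.5, -0.25, 0.0]
lam = 0.1  # illustrative regularization weight, not a recommended value

loss = regularized_loss(x, x_rec, h, lam)
print("regularized loss:", loss)
```

Here the reconstruction term is 0.25 + 0.25 = 0.5 and the penalty is 0.1 × 0.75 = 0.075, so a larger `lam` shifts the optimum toward codes with more entries at or near zero at the cost of reconstruction accuracy.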

A denoising autoencoder (DAE) is an AE designed to remove noise from a signal or an image. Chen et al. developed an efficient DAE, which marginalizes the noise and has a computationally efficient closed-form solution. To provide robustness, the system is trained using additive Gaussian noise or binary masking noise (force some percentage of inputs to zero). Many RS applications use an AE for denoising. Figure 1(a) shows an example of an AE. The diabolo shape results in dimensionality reduction.

Fig. 1 Block diagrams of DL architectures: (a) AE, (b) CNN, (c) DBN, and (d) RNN.
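The two corruption schemes mentioned for DAE training, additive Gaussian noise and binary masking noise, can be sketched as follows. This is only an illustration of explicit input corruption, not the marginalized closed-form approach of Chen et al.; the function names, the noise level `sigma`, and the masking fraction are assumptions.

```python
import random

def add_gaussian_noise(x, sigma, rng):
    """Additive Gaussian noise: perturb every input with zero-mean noise of std sigma."""
    return [v + rng.gauss(0.0, sigma) for v in x]

def mask_inputs(x, fraction, rng):
    """Binary masking noise: force a fixed fraction of randomly chosen inputs to zero."""
    n_masked = round(fraction * len(x))
    masked_idx = set(rng.sample(range(len(x)), n_masked))
    return [0.0 if i in masked_idx else v for i, v in enumerate(x)]

rng = random.Random(0)             # seeded so the sketch is repeatable
x = [0.2, 0.4, 0.6, 0.8, 1.0]      # hypothetical clean input

x_gauss = add_gaussian_noise(x, 0.1, rng)
x_masked = mask_inputs(x, 0.4, rng)
print("gaussian:", x_gauss)
print("masked:  ", x_masked)
```

During training, the corrupted vector would be fed to the encoder while the loss is still computed against the clean `x`, which is what pushes the network to learn to undo the corruption.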

