没有合适的资源?快使用搜索试试~ 我知道了~
温馨提示
Detection of anomalous trajectories is an important problem in the surveillance domain. Various algorithms based on learning of normal trajectory patterns have been proposed for this problem. Yet, these algorithms typically suffer from one or more limitations: They are not designed for sequential analysis of incomplete trajectories or online learning based on an incrementally updated training set. Moreover, they typically involve tuning of many parameters, including ad-hoc anomaly thresholds, and may therefore suffer from overfitting and poorly-calibrated alarm rates.
资源推荐
资源详情
资源评论
1222 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 26, NO. 3, JUNE 2018
On-Line Anomaly Detection With High Accuracy
Kun Xie , Member, IEEE, Xiaocan Li, Xin Wang, Member, IEEE, Jiannong Cao, Fellow, IEEE,
Gaogang Xie, Member, IEEE, Jigang Wen, Dafang Zhang, Member, IEEE,andZhengQin
Abstract—Traffic anomaly detection is critical for advanced
Internet management. Existing detection algorithms generally
convert the high-dimensional data to a long vector, which
compromises the detection accuracy due to the loss of spatial
information of data. Moreover, they are generally designed
based on the separation of normal and anomalous data in
a time period, which not only introduces high storage and
computation cost but also prevents timely detection of anom-
alies. Online and accurate traffic anomaly detection is critical
but difficult to support. To address the challenge, this paper
directly models the monitoring data in each time slot as a
2-D matrix, and detects anomalies in the new time slot based
on bilateral principal component analysis (B-PCA). We propose
several novel techniques in OnlineBPCA to support quick and
accurate anomaly detection in real time, including a novel B-
PCA-based anomaly detection principle that jointly considers
the variation of both row and column principal directions for
more accurate anomaly detection, an approximate algorithm
to avoid using iteration procedure to calculate the principal
directions in a close-form, and a sequential anomaly algorithm
to quickly update principal directions with low computation and
storage cost when receiving a new data matrix at a time slot.
To the best of our knowledge, this is the first work that exploits
2-D PCA for anomaly detection. We have conducted extensive
simulations to compare our OnlineBPCA with the state-of-art
Manuscript received April 26, 2017; revised November 24, 2017 and
March 5, 2018; accepted March 14, 2018; approved by IEEE/ACM
T
RANSACTIONS ON NETWORKING Editor C. W. Tan. Date of publication
April 26, 2018; date of current version June 14, 2018. This work was
supported in part by the National Natural Science Foundation of China under
Grant 61572184, Grant 61725206, Grant 61472130, Grant 61472131, and
Grant 61772191, in part by the Hunan Provincial Natural Science Foundation
of China under Grant 2017JJ1010, in part by the Science and Technology Key
Projects of Hunan Province under Grant 2015TP1004 and Grant 2016JC2012,
in part by the U.S. ONR under Grant N00014-17-1-2730, in part by the
NSF under Grant ECCS 1408247, Grant CNS 1526843, and Grant ECCS
1731238, and in part by the Open Project Funding of the CAS Key Lab-
oratory of Network Data Science and Technology, Institute of Computing
Technology, Chinese Academy of Sciences, under Grant CASNDST201704.
(Corresponding author: Kun Xie.)
K. Xie is with the College of Computer Science and Electronics Engineer-
ing, Hunan University, Changsha 410006, China, and also with the CAS Key
Laboratory of Network Data Science and Technology, Institute of Computing
Technology, Chinese Academy of Sciences, Beijing, China, and with the
Department of Electrical and Computer Engineering, The State University
of New York at Stony Brook, Stony Brook, NY 11794 USA (e-mail:
xiekun@hnu.edu.cn).
X. Li, D. Zhang, and Z. Qin are with the College of Computer Science and
Electronics Engineering, Hunan University, Changsha 410006, China (e-mail:
hnulxc@hnu.edu.cn; dfzhang@hnu.edu.cn; zqin@hnu.edu.cn).
X. Wang is with the Department of Electrical and Computer Engineering,
The State University of New York at Stony Brook, Stony Brook, NY
11794 USA (e-mail: x.wang@stonybrook.edu).
J. Cao is with the Department of Computing, The Hong Kong Polytechnic
University, Hong Kong (e-mail: csjcao@comp.polyu.edu.hk).
G. Xie and J. Wen are with the Network Research Center, Institute of
Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
(e-mail: xie@ict.ac.cn; wenjigang@ict.ac.cn).
Digital Object Identifier 10.1109/TNET.2018.2819507
anomaly detection algorithms using real traffic traces Abilene
and GÈANT. Our simulation results demonstrate that, compared
with other algorithms, our OnlineBPCA can achieve significantly
better detection performance with low false positive rate, high
true positive rate, and low computation cost.
Index Terms— Anomaly detection, on-line algorithm, bilateral
PCA.
I. INTRODUCTION
T
RAFFIC anomalies, caused by sources such as flash
crowds, denial-of-service attacks, port scans, and the
spreading of worms, can have detrimental effects on network
services. Detecting and diagnosing these anomalies are critical
to both network operators and end users.
Existing efforts [1]–[14] on anomaly detection usually
model the traffic monitoring data of a time slot as a vector
and use a traffic matrix to record the traffic monitoring data
of a period. In the example traffic matrix of Fig.1, each
row denotes an OD (origin-destination) pair and each column
denotes a time slot. As normal traffic data generally exhibit
strong spatio-temporal correlations [2], [8], [9], the normal
traffic matrix has low-rank. Moreover, as it is very costly for
an attacker to compromise a large number of OD pairs for
a long period of time, the anomalous data over time also
form a sparse matrix. Based on the observations, to detect
anomalies, existing studies usually separate the observed traffic
data into two parts, a low-rank normal data matrix and a sparse
outlier data matrix as shown in Fig.1. After the separation,
the anomalies are detected and located from the outlier part.
The techniques applied for anomaly detection based on
data separation include PCA [3], [5], [6], [8]–[12], Robust
PCA [14], [15], bilinear factor matrix norm minimization [16],
and recent Direct Robust Matrix Factorization (DRMF) [1],
[17]). Detecting anomalies generally based on off-line learn-
ing, these methods require storing all the monitoring data
within a period and operate on these data, which not only
introduces high storage and computation cost but also prevents
timely detection of anomalies.
It is essential to detect a sudden or unexpected change
of the traffic behavior as soon as possible. Although very
important, real-time anomaly detection is extremely difficult
to achieve. It requires a light-weight algorithm to accurately
and quickly identify whether the newly arriving data contain
anomalies or not. Different from data separation, there are very
limited studies on online anomaly detection. The work in [18]
attempts to check the variation of PCA transformation between
time slots to detect the anomaly. Although it is effective,
designed based on conventional PCA that only operates over a
vector of data, it still models the traffic data in each time slot
1063-6692 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
XIE et al.: ON-LINE ANOMALY DETECTION WITH HIGH ACCURACY 1223
Fig. 1. Traffic anomaly detection methods based data separation.
as a vector. For a network consisting of N nodes, there are
N ×N OD pairs, which will form a very long vector of the size
N ×N. This will in turn result in a large covariance matrix and
large computation cost in the process of PCA transformation.
Actually, traffic monitoring data captured in one time slot
can be naturally represented as an N × N matrix, with each
row denoting the transmission data from the same origin
node and each column denoting the transmission towards the
same destination. In existing anomaly detection approaches,
no matter through off-line data separation or online algorithm
of [18], the data from network measurements in one time slot
are usually modeled as a vector. The vector either directly
contains all OD pairs as in Fig.1, or is formed with the con-
catenation of rows or columns of an N ×N source-destination
matrix through “matrix-to-vector alignment”. Besides the com-
putation complexity resulted from a long vector, the spatial
information hidden in the traffic data will get lost when
representing data in a vector form, which will compromise
the performance of anomaly detection.
Rather than converting data from the matrix form to vec-
tor, directly using the matrix to represent traffic data in a
time slot for anomaly detection can better capture the data
relationship between rows and columns, which may help
increase the detection accuracy. To understand the possibility
and benefit, in this work, we propose to directly apply bilateral
PCA (B-PCA) over the data points corresponding to
two-dimensional data matrices for on-line anomaly detection.
Although there are some recent studies [19]–[22] on the direct
use of two-dimensional PCA over image data for feature
extraction, these methods for feature extraction cannot be
directly used for anomaly detection. Applying B-PCA for on-
line anomaly detection faces three major challenges:
• There lacks a principle to exploit the data features from
both rows and columns for anomaly detection;
• Different from conventional PCA, B-PCA does not have
close-form solutions for finding the projection matrices,
which makes it even harder to quickly detect the anomaly;
• Performing B-PCA operation over a huge set of historical
data is computationally expensive and time consuming.
Different from conventional algorithms for anomaly detec-
tion, we model the network monitoring data in each time slot
as a 2D matrix, and propose an online anomaly detection
scheme based on B-PCA (termed OnlineBPCA) to detect
whether newly arriving data contain anomalies or not. In
addition, to enable online anomaly detection, we propose a
set of algorithms for quick data processing. Our OnlineBPCA
includes the following set of novel techniques to support quick
and accurate anomaly detection:
• We propose a novel two-directional-change-based anom-
aly detection principle to detect whether newly arriving
data contain anomalies, where we check the changes
of the principal directions from both row side and col-
umn side. This is the first anomaly detection principle
proposed that allows B-PCA to be applied for anomaly
detection. As these two principal directions can more
comprehensively and accurately extract the features hid-
den in the monitoring data, our OnlineBPCA can achieve
a higher accuracy in detecting the anomaly than conven-
tional anomaly detection algorithms.
• We propose an approximate algorithm to calculate the
principal directions in a close form without using the
iteration procedure, which in turn provides the possibility
of designing a sequential algorithm for quickly detecting
anomalies online. Our simulation results demonstrate that
such an approximation does not decrease the detection
accuracy while significantly reducing the computation
cost.
• To quickly detect anomalous data, principal directions
need to be updated to adapt to the network changes in
real time. Unlike the batch methods which process all the
data together, we propose a sequential anomaly detection
algorithm that does not require the storage of the past
data and can update the principal directions using the
most recent monitoring data. As a result, our method is
fast and preferred for streaming data and on-line anomaly
detection.
• To amplify the impact of newly arriving data on the prin-
cipal directions of the monitoring data set, we propose a
novel strength method to duplicate the newly arriving
data multiple times. This would make it easier to find
anomalies even for a large data set.
Using traffic trace data Abilene [23] and GÈANT [24],
we implement ten anomaly detection algorithms to evalu-
ate our OnlineBPCA. Compared with other peer algorithms,
OnlineBPCA can detect whether newly arriving data contain
anomalies with much lower False Positive Rate and higher
True Positive Rate. Specifically, benefiting from our approx-
imate algorithm and sequential algorithm, OnlineBPCA can
accurately detect the anomaly with very fast speed.
The rest of this paper is organized as follows. Section II
presents the related work. We introduce the preliminary work
on B-PCA in Section III. We describe our anomaly detection
principle, approximate algorithm to find the projection matri-
ces, and sequential anomaly detection algorithm in Section IV,
Section V, and Section VI, respectively. Finally, we evaluate
the performance using real traffic trace data in Section VIII,
and conclude the work in Section IX.
II. R
ELATED WORK
Principal Component Analysis (PCA) [7] is perhaps the
best-known statistical analysis technique to achieve data sep-
aration for anomaly detection. PCA uses an orthogonal trans-
formation to convert possibly correlated observed variables
into a set of linearly uncorrelated variables (called principal
1224 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 26, NO. 3, JUNE 2018
components or principal directions). Some recent papers that
apply PCA to the traffic anomaly detection have shown some
promising initial results [3], [5], [6], [8]–[12]. The principal
components (PCs) are found and sorted in the order of
contribution to overall variance, with the PCs in the lower
dimensional and higher dimensional subspaces capturing the
dynamic properties of the system and noisy information,
respectively. As a result, to separate data, the PCs can be
divided into two sets. The traffic data mapped to principal
components (PCs) in the lower dimensional subspace are
normal, and the remaining data are the anomalies. For data
separation, all traffic data should be calculated and mapped
to the two subspaces, corresponding respectively to the two
types of PCs.
Although PCA-based data separation is effective when the
corruption is caused by small additive noise, recent study [13]
shows that traditional PCA-based approaches fail under the
large corruption, even if the corruption affects only very
few of the observations. To achieve better data separation,
recently, Candès et. al. [14] propose Robust PCA (RPCA)
which decomposes a given observation (noisy) matrix X into
a low-rank component X
and a sparse outlier component E.
To make the problem solvable, the work in [15] replaces
the matrix rank and the cardinality (
0
) functions with their
convex surrogates, the nuclear norm
∗
(i.e., the sum of its
singular values) and the L
1
norm
1
, and solves the following
convex optimization problem
min
X
,E
{X
∗
+ λE
1
}
st. X
+ E = X (1)
where λ is a positive weighting parameter. To decompose
the data into low-rank component and sparse component,
these methods resort to some relaxation techniques which may
largely impact the accuracy of anomaly detection.
To conquer the challenge in RPCA, work in [17] proposes
Direct Robust Matrix Factorization (DRMF) which directly
formulates the problem in its original way using the matrix
rank to represent the low rank feature of normal data matrix
and the L
0
-norm to represent the sparse feature of the anomaly
data. However, the solution involves the iterative execution of
SVD decomposition, which will bring very high computation
cost and is not scalable for large traffic data.
Shang et al. [16] observe two other issues of RPCA, which
leads the solution to be biased. That is, the use of nuclear
norm over penalizes large singular values, and the use of L
1
norm over penalizes large entries of the matrix. To address
the issues, the authors propose two bilinear factor matrix
norm minimization models for robust principal component
analysis. Specifically, the paper considers two specific l
p
-norm
minimization, with p =1/2 and p =2/3, respectively.
Although data separation is an effective way of detecting
anomalies appearing within a period, it is not suitable for
online detection. Data separation approaches usually work off-
line and require operating on the whole set of traffic data
captured in a period consisting of multiple time slots, which
consume a large amount of storage and long computation
time. Moreover, some data separation techniques such as
Fig. 2. Matrix-to-vector alignment.
RPCA and DRMF need a time-consuming iterative process to
separate the observed traffic data, which further increases the
computation cost. In contrast, our anomaly detection algorithm
aims to quickly and accurately detect whether traffic data
newly arriving in a time slot contain anomalies.
Very limited work [18] studies the on-line anomaly detec-
tion. Different from PCA-based algorithms [5], [6], [8]–[11]
[3], [12] that separate normal data and anomalous data by
mapping the traffic data into two types of PCs, [18] proposes
to detect whether the new data contain anomalies by checking
the impact on PCs from the newly arriving data. However,
as it is designed based on conventional PCA which can only
operate vector-form based data points, the traffic monitoring
data of each time slot in [18] is modeled as a vector like other
traffic anomaly detection algorithms, which suffer from the
loss of spatial data correlation and computation complexity.
As shown in Fig.2, the entries of first column denote the
traffic data from different source nodes to the same destination
node a, and are thus correlated. However, they are set far away
from each other in the vectorized representation. A column
anomaly can happen when a network of remotely controlled
and widely scattered Zombies or Botnet computers launch
DDoS attacks by simultaneously sending a large amount of
traffic and/or a large number of service requests to the target
system. Besides losing the spatial information and compro-
mising the anomaly detection, “matrix-to-vector alignment”
also leads the vector to be long, which will result in a
large size covariance matrix in [18]’s approach. Computing
the eigenvectors of a large covariance matrix is very time-
consuming.
Different from conventional PCA which can only deal with
vector data, a new technique called two-dimensional principal
component analysis (2DPCA) [19] was recently proposed in
the image field to handle 2D image, which directly computes
eigenvectors of the so-called covariance matrix without matrix-
to-vector conversion. Because the size of the covariance matrix
is equal to the width of images, which is quite small compared
with the size of a covariance matrix in PCA, 2DPCA evaluates
the image covariance matrix more accurately and computes the
corresponding eigenvectors more efficiently than PCA. Based
on tests over data from several databases [19], the accuracy
of face recognition using 2DPCA was found to be higher
than that using PCA, and the extraction of image features is
computationally more efficient.
However, studies in [26] show that 2DPCA is essentially
working in the row direction of images. If we apply 2DPCA
to anomaly detection, the information hidden in the data matrix
can not be fully utilized, so the detection accuracy will be still
low. To overcome the restrictions under 2DPCA, two linear
剩余13页未读,继续阅读
资源评论
fuwell
- 粉丝: 1
- 资源: 13
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
最新资源
资源上传下载、课程学习等过程中有任何疑问或建议,欢迎提出宝贵意见哦~我们会及时处理!
点击此处反馈
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功