
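【Challenges for network intrusion detection systems (NIDS) and the Kitsune solution】 As attacks on computer networks become more frequent, network intrusion detection systems (NIDS) have become an important tool for securing networks. An NIDS monitors network traffic, identifies abnormal behavior, and raises alerts in order to prevent malicious attacks. However, traditional neural-network-based NIDSs have notable problems. Training a neural network requires substantial resources, which many network gateways and routers cannot provide because their memory and processing power are limited. Existing neural network models also typically rely on supervised learning, which requires experts to label network traffic, a process that is both time-consuming and expensive. To address these challenges, the paper "Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection" proposes Kitsune, an unsupervised NIDS that learns online. Kitsune's core algorithm, KitNET, uses an ensemble of autoencoders to collectively differentiate between normal and abnormal traffic patterns. An autoencoder is a special kind of neural network that discovers the intrinsic structure of data by learning to compress and decompress it, which allows it to identify abnormal behavior.

KitNET's strengths are efficiency and flexibility. Because Kitsune uses unsupervised learning, it needs no pre-labeled data: it learns and adapts to normal behavior automatically from local network traffic, which reduces the need for manual intervention. KitNET is supported by a feature extraction framework that efficiently tracks the traffic patterns of every network channel, improving detection efficiency. Kitsune performs well in the evaluation: even on a resource-limited Raspberry PI it achieves detection performance comparable to offline anomaly detectors, demonstrating that it is practical and economic in real deployments.

【Autoencoders in intrusion detection】 Autoencoders play a key role in Kitsune. This neural network architecture consists of an encoder and a decoder. The encoder compresses the input into a low-dimensional representation, and the decoder attempts to reconstruct the original input from that compressed representation. During training, the autoencoder tries to minimize the difference between the reconstructed input and the original input, and in doing so learns the intrinsic structure of the data. In the intrusion detection setting, when network traffic is abnormal the autoencoder's reconstruction error increases, which triggers an alert.

【Ensemble learning improves performance】 Kitsune uses an ensemble learning strategy, a collection of multiple autoencoders, to further strengthen detection. Ensemble learning combines the predictions of several models to improve overall stability and accuracy. In Kitsune, each autoencoder focuses on learning a different aspect or subset of features of the network traffic, and their combined judgment captures abnormal behavior more completely.

【Online learning and real-time response】 Kitsune's online learning ability allows the system to adapt to changes in the network environment in real time. This means it can adjust its model quickly when new attacks appear and stay sensitive to current threats, a property that is essential for coping with constantly evolving network attacks.

【Summary】 Kitsune offers a novel approach to network intrusion detection. By using an ensemble of autoencoders and unsupervised learning, it addresses the resource requirements and manual labeling that burden traditional NIDSs. By running efficiently on hardware with limited resources, Kitsune shows its potential as a practical and economic NIDS and offers a new direction for network security.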


Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection

Yisroel Mirsky, Tomer Doitshman, Yuval Elovici and Asaf Shabtai

Ben-Gurion University of the Negev

{yisroel, tomerdoi}@post.bgu.ac.il, {elovici, shabtaia}@bgu.ac.il

Abstract—Neural networks have become an increasingly popu-

lar solution for network intrusion detection systems (NIDS). Their

capability of learning complex patterns and behaviors makes them

a suitable solution for differentiating between normal trafﬁc and

network attacks. However, a drawback of neural networks is

the amount of resources needed to train them. Many network

gateways and router devices, which could potentially host an

NIDS, simply do not have the memory or processing power to

train and sometimes even execute such models. More importantly,

the existing neural network solutions are trained in a supervised

manner, meaning that an expert must label the network traffic

and update the model manually from time to time.

In this paper, we present Kitsune: a plug and play NIDS

which can learn to detect attacks on the local network, without

supervision, and in an efﬁcient online manner. Kitsune’s core

algorithm (KitNET) uses an ensemble of neural networks called

autoencoders to collectively differentiate between normal and

abnormal trafﬁc patterns. KitNET is supported by a feature

extraction framework which efﬁciently tracks the patterns of

every network channel. Our evaluations show that Kitsune can

detect various attacks with a performance comparable to ofﬂine

anomaly detectors, even on a Raspberry PI. This demonstrates

that Kitsune can be a practical and economic NIDS.

Keywords—Anomaly detection, network intrusion detection, on-

line algorithms, autoencoders, ensemble learning.

I. INTRODUCTION

The number of attacks on computer networks has been

increasing over the years [1]. A common security system used

to secure networks is a network intrusion detection system

(NIDS). An NIDS is a device or software which monitors all

trafﬁc passing a strategic point for malicious activities. When

such an activity is detected, an alert is generated, and sent to

the administrator. Conventionally an NIDS is deployed at a

single point, for example, at the Internet gateway. This point

deployment strategy can detect malicious trafﬁc entering and

leaving the network, but not malicious trafﬁc traversing the

network itself. To resolve this issue, a distributed deployment

strategy can be used, where a number of NIDSs are

connected to a set of strategic routers and gateways within

the network.

Over the last decade many machine learning techniques

have been proposed to improve detection performance [2], [3],

[4]. One popular approach is to use an artiﬁcial neural network

(ANN) to perform the network trafﬁc inspection. The beneﬁt

of using an ANN is that ANNs are good at learning complex

non-linear concepts in the input data. This gives ANNs a

great advantage in detection performance with respect to other

machine learning algorithms [5], [2].

The prevalent approach to using an ANN as an NIDS is

to train it to classify network trafﬁc as being either normal

or some class of attack [6], [7], [8]. The following shows the

typical approach to using an ANN-based classiﬁer in a point

deployment strategy:

1) Have an expert collect a dataset containing both normal

trafﬁc and network attacks.

2) Train the ANN to classify the difference between normal

and attack trafﬁc, using a strong CPU or GPU.

3) Transfer a copy of the trained model to the net-

work/organization’s NIDS.

4) Have the NIDS execute the trained model on the observed

network trafﬁc.

In general, a distributed deployment strategy is only prac-

tical if the number of NIDSs can economically scale according

to the size of the network. One approach to achieve this goal

is to embed the NIDSs directly into inexpensive routers (i.e.,

with simple hardware). We argue that it is impractical to use

ANN-based classiﬁers with this approach for several reasons:

Ofﬂine Processing. In order to train a supervised model, all

labeled instances must be available locally. This is infeasible

on a simple network gateway since a single hour of trafﬁc may

contain millions of packets. Some works propose ofﬂoading

the data to a remote server for model training [9] [3]. However,

this solution may incur signiﬁcant network overhead, and does

not scale.

Supervised Learning. The labeling process takes time and is

expensive. More importantly, what is considered to be normal

depends on the local traffic observed by the NIDS. Furthermore,

attacks change over time and new ones are

constantly being discovered [10], so continuous maintenance

of a malicious attack traffic repository may be impractical.

Finally, classiﬁcation is a closed-world approach to identifying

concepts. In other words, a classiﬁer is trained to identify the

classes provided in the training set. However, it is unreasonable

to assume that all possible classes of malicious trafﬁc can be

collected and placed in the training data.

Permission to freely reproduce all or part of this paper for noncommercial purposes is granted provided that copies bear this notice and the full citation on the first page. Reproduction for commercial purposes is strictly prohibited without the prior written consent of the Internet Society, the first-named author (for reproduction of an entire paper only), and the author's employer if the paper was prepared within the scope of employment.

NDSS '18, 18-21 February 2018, San Diego, CA, USA

http://dx.doi.org/10.14722/ndss.2018.23204

arXiv:1802.09089v2 [cs.CR] 27 May 2018

Fig. 1: An illustration of Kitsune's anomaly detection algorithm KitNET. (Diagram labels: Map, Ensemble Layer, Output Layer, RMSE, score.)

High Complexity. The computational complexity of an ANN

grows exponentially with the number of neurons [11]. This means

that an ANN which is deployed on a simple network gateway,

is restricted in terms of its architecture and number of input

features which it can use. This is especially problematic on

gateways which handle high velocity trafﬁc.

In light of the challenges listed above, we suggest that

the development of an ANN-based network intrusion detector,

which is to be deployed and trained on routers in a distributed

manner, should adhere to the following restrictions:

Online Processing. After training or executing the model

with an instance, the instance is immediately discarded. In

practice, a small number of instances can be stored at any

given time, as done in stream clustering [12].

Unsupervised Learning. Labels, which indicate explicitly

whether a packet is malicious or benign, are not used in the

training process. Other meta information can be used so long

as acquiring the information does not delay the process.

Low Complexity. The packet processing rate must exceed the

expected maximum packet arrival rate. In other words, we

must ensure that there is no queue of packets waiting to be

processed by the model.
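The three restrictions above can be illustrated with a minimal sketch. It assumes a toy stand-in model (`RunningMean`) rather than a neural network, and the names `process_stream` and `keeps_up` are hypothetical helpers introduced only for this illustration. Each instance is scored, used once for training, and then dropped; no labels are involved; and the throughput check compares the processing rate against a peak arrival rate.

```python
class RunningMean:
    """Toy online model: stores a count and a mean, never the instances."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def execute(self, x):
        # Score is the absolute deviation from the mean learned so far.
        return abs(x - self.mean)

    def train(self, x):
        # Incremental mean update; no label is needed.
        self.n += 1
        self.mean += (x - self.mean) / self.n


def process_stream(stream, model):
    """Score, then train on, each instance; the instance is not kept."""
    scores = []  # scores are collected here only so the demo can show them
    for x in stream:  # works with generators, so nothing is buffered
        scores.append(model.execute(x))
        model.train(x)
    return scores


def keeps_up(seconds_per_packet, max_arrival_rate):
    """True if the processing rate (packets/s) exceeds the peak arrival rate."""
    return (1.0 / seconds_per_packet) > max_arrival_rate


model = RunningMean()
scores = process_stream(iter([1.0, 1.0, 5.0]), model)
```

In this sketch a model that needs 0.1 ms per packet satisfies `keeps_up` for a peak of 5,000 packets per second, while one that needs 1 ms per packet does not.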

In this paper, we present Kitsune: a novel ANN-based

NIDS which is online, unsupervised, and efﬁcient. A Kitsune,

in Japanese folklore, is a mythical fox-like creature that has a

number of tails, can mimic different forms, and whose strength

increases with experience. Similarly, Kitsune has an ensemble

of small neural networks (autoencoders), which are trained

to mimic (reconstruct) network trafﬁc patterns, and whose

performance incrementally improves over time.

The architecture of Kitsune’s anomaly detection algorithm

(KitNET) is illustrated in Fig. 1. First, the features of an

instance are mapped to the visible neurons of the ensemble.

Next, each autoencoder attempts to reconstruct the instance’s

features, and computes the reconstruction error in terms of

root mean squared errors (RMSE). Finally, the RMSEs are

forwarded to an output autoencoder, which acts as a non-linear

voting mechanism for the ensemble. We note that while train-

ing Kitsune, no more than one instance is stored in memory at

a time. KitNET has one main parameter, which is the maximum

number of inputs for any given autoencoder in the ensemble.

This parameter is used to increase the algorithm’s speed with

a modest trade-off in detection performance.
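The flow described above (map features to small autoencoders, compute one RMSE per autoencoder, pass the RMSEs to an output autoencoder) can be sketched as follows. This is an illustrative approximation under stated assumptions, not the authors' implementation: the class names, the fixed `feature_map`, the 3/4 hidden-layer ratio, the learning rate, and the plain SGD update are all choices made for this sketch, and the input features are assumed to be already scaled to [0, 1].

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """One-hidden-layer autoencoder trained online (one instance per update)."""

    def __init__(self, n_in, n_hidden, rng, lr=0.5):
        self.lr = lr  # learning rate: an assumption of this sketch
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def _forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        y = [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(self.W2, self.b2)]
        return h, y

    def rmse(self, x):
        """Reconstruction error of x as a root mean squared error."""
        _, y = self._forward(x)
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, x)) / len(x))

    def train(self, x):
        """One stochastic gradient step on the squared reconstruction error."""
        h, y = self._forward(x)
        d_out = [(yk - xk) * yk * (1.0 - yk) for yk, xk in zip(y, x)]
        d_hid = [hj * (1.0 - hj) * sum(d * row[j] for d, row in zip(d_out, self.W2))
                 for j, hj in enumerate(h)]
        for k, d in enumerate(d_out):
            self.W2[k] = [w - self.lr * d * hj for w, hj in zip(self.W2[k], h)]
            self.b2[k] -= self.lr * d
        for j, d in enumerate(d_hid):
            self.W1[j] = [w - self.lr * d * xi for w, xi in zip(self.W1[j], x)]
            self.b1[j] -= self.lr * d


class EnsembleSketch:
    """Ensemble layer of small autoencoders feeding one output autoencoder."""

    def __init__(self, feature_map, seed=0):
        rng = random.Random(seed)
        self.feature_map = feature_map  # hypothetical: lists of feature indices
        self.ensemble = [TinyAutoencoder(len(idx), max(1, (3 * len(idx)) // 4), rng)
                         for idx in feature_map]
        k = len(feature_map)
        self.output = TinyAutoencoder(k, max(1, (3 * k) // 4), rng)

    def _ensemble_rmses(self, x, train):
        rmses = []
        for ae, idx in zip(self.ensemble, self.feature_map):
            sub = [x[i] for i in idx]  # map features to this autoencoder
            rmses.append(ae.rmse(sub))
            if train:
                ae.train(sub)
        return rmses

    def train(self, x):
        # Only the current instance is held while training.
        self.output.train(self._ensemble_rmses(x, train=True))

    def score(self, x):
        # The output autoencoder's RMSE over the ensemble RMSEs is the score.
        return self.output.rmse(self._ensemble_rmses(x, train=False))


noise = random.Random(1)
model = EnsembleSketch(feature_map=[[0, 1], [2, 3]])
for _ in range(3000):
    model.train([0.2 + noise.uniform(-0.02, 0.02), 0.2 + noise.uniform(-0.02, 0.02),
                 0.8 + noise.uniform(-0.02, 0.02), 0.8 + noise.uniform(-0.02, 0.02)])
normal_score = model.score([0.2, 0.2, 0.8, 0.8])
anomaly_score = model.score([0.9, 0.9, 0.1, 0.1])
```

Here the length of each index list in `feature_map` plays the role of the maximum number of inputs per autoencoder: shorter lists mean smaller networks and less work per instance.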

The reason we use autoencoders is because (1) they can

be trained in an unsupervised manner, and (2) they can be used for

anomaly detection in the event of a poor reconstruction. The

reason we propose using an ensemble of small autoencoders,

is because they are more efficient and can be less noisy than

a single autoencoder over the same feature space. From our

experiments, we found that Kitsune can increase the packet

processing rate by a factor of ﬁve, and provide a detection

performance which rivals other offline (batch) anomaly

detectors.

In summary, the contributions of this paper are as follows:

• A novel autoencoder-based NIDS for simple network de-

vices (Kitsune), which is lightweight and plug-and-play.

To the best of our knowledge, we are the ﬁrst to propose

the use of autoencoders with or without ensembles for

online anomaly detection in computer networks. We also

present the core algorithm (KitNET) as a generic online

unsupervised anomaly detection algorithm, and provide

the source code for download.

• A feature extraction framework for dynamically main-

taining and extracting implicit contextual features from

network trafﬁc. The framework has a small memory

footprint since the statistics are updated incrementally

over damped windows.

• An online technique for automatically constructing the

ensemble of autoencoders (i.e., mapping features to ANN

inputs) in an unsupervised manner. The method involves

the incremental hierarchical clustering of the feature-space

(transpose of the unbounded dataset), and bounding of

cluster sizes.

• Experimental results on an operational IP camera video

surveillance network, IoT network, and a wide variety of

attacks. We also demonstrate the algorithm’s efﬁciency,

and ability to run on a simple router, by performing

benchmarks on a Raspberry PI.
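To illustrate how statistics can be updated incrementally over a damped window with a small memory footprint, the following minimal sketch keeps only three decayed sums per stream. The exponential decay function `2 ** (-decay_rate * dt)`, the class name `DampedStat`, and its parameter names are assumptions made for this sketch; the text above states only that the statistics are maintained incrementally over damped windows.

```python
import math


class DampedStat:
    """Incremental mean and standard deviation over a damped window."""

    def __init__(self, decay_rate):
        self.decay_rate = decay_rate  # hypothetical parameter: larger forgets faster
        self.w = 0.0    # decayed count of observations
        self.ls = 0.0   # decayed linear sum
        self.ss = 0.0   # decayed sum of squares
        self.t_last = None

    def update(self, x, t):
        """Decay the stored sums by the elapsed time, then add observation x."""
        if self.t_last is not None:
            factor = 2.0 ** (-self.decay_rate * (t - self.t_last))
            self.w *= factor
            self.ls *= factor
            self.ss *= factor
        self.t_last = t
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    def mean(self):
        return self.ls / self.w

    def std(self):
        variance = self.ss / self.w - self.mean() ** 2
        return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


stat = DampedStat(decay_rate=1.0)
stat.update(10.0, t=0.0)
stat.update(20.0, t=1.0)  # the first observation now carries weight 0.5
```

Memory use is constant per tracked stream regardless of how many packets have been seen, since no observation is stored after `update` returns. Calling `mean` or `std` before the first `update` is not handled in this sketch.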

The rest of the paper is organized as follows: Section

II discusses related work in the domain of online anomaly

detection. Section III provides a background on autoencoders

and how they work. Section IV presents Kitsune’s framework

and its entire machine learning pipeline. Section V presents

experimental results in terms of detection performance and

run-time performance. Finally, in section VII we present our

conclusion.

II. RELATED WORK

The domain of using machine learning (speciﬁcally

anomaly detection) for implementing NIDSs was extensively

researched in the past [13], [14], [15], [16], [17]. However,

these solutions usually do not make any assumptions about the

resources of the machine training or executing the

model, and therefore are either too expensive to train and

execute on simple gateways, or require a labeled dataset to

perform the training process.

Several previous works have proposed online anomaly

detection mechanisms using different lightweight algorithms.

The source code for KitNET is available for download at: https://github.com/ymirsky/KitNET-py.

Examples include the PAYL IDS, which models simple histograms

of packet content [18], and the kNN algorithm [19]. These

methods are either very simple and therefore produce very

poor results, or require accumulating data for the training or

detection.

A popular algorithm for network intrusion detection is

the ANN. This is because of its ability to learn complex

concepts, as well as the concepts from the domain of network

communication [17]. In [20], the authors evaluated the ANN,

among other classiﬁcation algorithms, in the task of network

intrusion detection, and proposed a solution based on an

ensemble of classiﬁers using connection-based features. In [8],

the authors presented a modiﬁcation to the back propagation

algorithm to increase the speed of an ANN’s training process.

In [7], the authors used multiple ANN-based classiﬁers, where

each one was trained to detect a speciﬁc type of attack. In [9],

the authors proposed a hierarchical method where each packet

ﬁrst passes through an anomaly detection model, then if an

anomaly is raised, the packet is evaluated by a set of ANN

classiﬁers where each classiﬁer is trained to detect a speciﬁc

attack type.

All of the aforementioned papers which use ANNs are

either supervised, or are not suitable for a simple network

gateway. In addition, some of the works assume that the

training data can be stored and accumulated which is not the

case for simple network gateways. Our solution enables a plug-

and-play deployment which can operate at much faster speeds

than the aforementioned models.

With regard to the use of autoencoders: In [21], the

authors used an ensemble of deep neural networks to address

object tracking in the online setting. Their proposed method

uses a stacked denoising autoencoder (SDAE). Each layer of

the SDAE serves as a different feature space for the raw

image data. The scheme transforms each layer of the SDAE

to a deep neural network which is used as a discriminative

binary classiﬁer. Although the authors apply autoencoders in

an online setting, they did not perform anomaly detection, nor

address the challenge of real-time processing (which is a great

challenge with deep neural networks). Furthermore, training

a deep neural network is complex and cannot be practically

performed on a simple network device. In [22] and [23], the

authors propose the use of autoencoders to extract features

from datasets in order to improve the detection of cyber

threats. However, the autoencoders themselves were not used

for anomaly detection. Ultimately, the authors use classiﬁers

to detect the cyber threats. Therefore, their solution requires an

expert to label instances, whereas our solution is unsupervised,

and plug-and-play.

In [24], the authors proposed the generic use of an au-

toencoder for detecting anomalies. In [19], the authors use

autoencoders to detect anomalies in power grids. These works

differ from ours because (1) they are not online, (2) the archi-

tecture used by the authors is not lightweight and scalable as

an ensemble, and (3) they have not been applied to network intrusion

detection. We note that part of this paper’s contribution is an

appropriate feature extraction framework, which enables the

use of autoencoders in the online network setting.

III. BACKGROUND: AUTOENCODERS

Autoencoders are the foundational building blocks of Kitsune.

In this section we provide a brief introduction to autoencoders:

what they are, and how they work. To describe

the training and execution of an autoencoder we will refer to

the example in Fig. 2.

Fig. 2: An example autoencoder with one compression layer, which reconstructs instances with three features.

A. Artiﬁcial Neural Networks

ANNs are made up of layers of neurons, where each layer is connected sequentially via synapses. The synapses have associated weights which collectively define the concepts learned by the model. Concretely, let $l^{(i)}$ denote the $i$-th layer in the ANN, and let $\|l^{(i)}\|$ denote the number of neurons in $l^{(i)}$. Let the total number of layers in the ANN be denoted as $L$. The weights which connect $l^{(i)}$ to $l^{(i+1)}$ are denoted as the $\|l^{(i)}\|$-by-$\|l^{(i+1)}\|$ matrix $W^{(i)}$ and the $\|l^{(i+1)}\|$-dimensional bias vector $\vec{b}^{(i)}$. Finally, we denote the collection of all parameters $\theta$ as the tuple $\theta \equiv (W, b)$, where $W$ and $b$ are the weights and biases of each layer respectively. Fig. 2 illustrates how the weights form the synapses of each layer in an ANN.

There are two kinds of layers in an ANN: visible layers and hidden layers. The visible layer receives the input instance $\vec{x}$ with an additional bias variable (a constant value of 1). $\vec{x}$ is a vector of numerical features which describes the instance, and is typically normalized to fall approximately in the range $[-1, +1]$ (e.g., using 0-1 normalization or z-score normalization) [25]. The difference between the visible layer and the hidden layers is that the visible layer is considered to be precomputed, and ready to be passed to the second layer.
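The two normalization schemes mentioned above can be sketched as follows. This is a batch version over a list of values for a single feature, given only for illustration; the function names are hypothetical, and an online system would instead have to track the minimum, maximum, mean, and standard deviation incrementally.

```python
import math


def min_max_normalize(values):
    """0-1 normalization: rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """z-score normalization: zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


scaled = min_max_normalize([2.0, 4.0, 6.0])
standardized = z_score_normalize([2.0, 4.0, 6.0])
```

With 0-1 normalization the output is bounded by construction, whereas z-scores are only approximately bounded, which matches the "approximately" in the description of the input range.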

B. Executing an ANN

To execute an ANN, $l^{(2)}$ is activated with the output of $l^{(1)}$ (i.e., $\vec{x}$) weighted with $W^{(1)}$, then the output of $l^{(2)}$ weighted with $W^{(2)}$ is used to activate $l^{(3)}$, and so on until the final layer has been activated. This process is known as forward-propagation.

Let $\vec{a}^{(i)}$ be the $\|l^{(i)}\|$ vector of outputs from the neurons in $l^{(i)}$. To obtain $\vec{a}^{(i+1)}$, we pass $\vec{a}^{(i)}$ through $l^{(i+1)}$ by computing

$$\vec{a}^{(i+1)} = f\left(W^{(i)} \cdot \vec{a}^{(i)} + \vec{b}^{(i)}\right) \tag{1}$$

where f is what’s known as the neuron’s activation function.

A common activation function, and what we use in Kitsune,

is the sigmoid function, deﬁned as

$$f(\vec{x}) = \frac{1}{1 + e^{-\vec{x}}} \tag{2}$$
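Forward-propagation with a sigmoid activation can be sketched in a few lines. The sketch assumes each layer's weights are stored as a list of rows, one row per neuron of the next layer, so that the product with the previous layer's output is well defined; this storage layout and the function names are choices made for the illustration.

```python
import math


def sigmoid(z):
    """Sigmoid activation applied to a single pre-activation value."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, weights, biases):
    """Propagate x through every layer: a_next = f(W . a + b)."""
    a = list(x)
    for W, b in zip(weights, biases):
        a = [sigmoid(sum(w * ai for w, ai in zip(row, a)) + bj)
             for row, bj in zip(W, b)]
    return a


# A 3-2-3 network, shaped like an autoencoder with one compression layer.
# All-zero parameters are used only to keep the demo easy to check by hand.
W1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # 3 inputs -> 2 hidden neurons
W2 = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # 2 hidden -> 3 outputs
reconstruction = forward([0.1, 0.5, 0.9], [W1, W2], [[0.0, 0.0], [0.0, 0.0, 0.0]])
```

With every weight and bias at zero, each neuron receives a pre-activation of 0, so every output equals the sigmoid of 0.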
