Knowledge Graph + Network Security
====
# Knowledge-Graph-Analyze
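## Attempt 1:
Store the preliminary output of Bro and Snort, such as network packets and low-level network events, in the knowledge graph. The knowledge graph then performs analysis on top of this data.
Open questions:
What kind of analysis exactly, and what results should it produce?
Is it the "multi-step attacks" that the advisor mentioned?
Does the storage format of the data in the knowledge graph need to change to meet the needs of that analysis?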
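On low-level network events:
Basic network events: Bro generates many log files, most of which are named after a protocol (and mostly contain the traffic related to that protocol). There are also some special log files, such as notice.log, whose content we can customize by adding notice types. For now, we treat the content recorded in notice.log as the so-called basic network events.
conn.log stores the log of connections on the network. Establishing a connection is itself a kind of event, so does everything that Bro organizes into log output fall under the category of events?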
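
One way to treat every Bro log record as an event is to read the tab-separated log format directly. The sketch below is a minimal illustration, assuming the default Bro TSV layout with a `#fields` header line; the `to_event` helper, its output keys, and the sample record (addresses, timestamp, uid) are hypothetical and not taken from the dataset.

```python
import io


def parse_bro_log(lines):
    """Yield one dict per record from a Bro TSV log (e.g. conn.log, notice.log)."""
    fields = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Header lines; only "#fields" is needed to name the columns.
            if line.startswith("#fields"):
                fields = line.split("\t")[1:]
            continue
        yield dict(zip(fields, line.split("\t")))


def to_event(record, log_name):
    """Wrap a log record as a generic event (hypothetical schema)."""
    return {
        "source_log": log_name,
        "time": float(record["ts"]),
        "orig": record.get("id.orig_h"),
        "resp": record.get("id.resp_h"),
        # notice.log has a "note" column; conn.log has "proto".
        "detail": record.get("note") or record.get("proto"),
    }


# Made-up sample in the shape of a conn.log file.
SAMPLE_CONN_LOG = (
    "#separator \\x09\n"
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto\n"
    "#types\ttime\tstring\taddr\tport\taddr\tport\tenum\n"
    "952438900.0\tCabc1\t192.0.2.10\t0\t172.16.115.20\t0\ticmp\n"
)

events = [to_event(r, "conn.log") for r in parse_bro_log(io.StringIO(SAMPLE_CONN_LOG))]
print(events)
```

Under this view, conn.log and notice.log records share one event shape, and the `source_log` key keeps track of which log each event came from.
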
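On network packets:
Network packets should be the rawest form of network traffic, with no higher-level analysis applied. In Packet Logger mode, Snort records exactly these network packets.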
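On the analysis and reasoning capabilities of the knowledge graph:
According to Chapter 8 of 《网络空间安全防御与态势感知》 (Cyber Defense and Situational Awareness), preliminary analysis and reasoning over network events requires an "ontology model", and the chapter mentions the OWL model. So does our data also need to be processed and converted into OWL-model data to make analysis and reasoning easier?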
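
To make the role of an ontology concrete, the sketch below applies a single reasoning rule, the RDFS subclass rule (if `x rdf:type A` and `A rdfs:subClassOf B`, then `x rdf:type B`), over plain Python tuples. This is only an illustration of the kind of inference an OWL reasoner would perform, not an OWL implementation; every `ex:` class and individual name is a hypothetical placeholder.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical ontology fragment; class names are illustrative only.
ontology = {
    ("ex:IPSweep", SUBCLASS_OF, "ex:Reconnaissance"),
    ("ex:Reconnaissance", SUBCLASS_OF, "ex:AttackEvent"),
    ("ex:AttackEvent", SUBCLASS_OF, "ex:NetworkEvent"),
}

# Hypothetical instance data: one observed event and its source host.
facts = {
    ("ex:event42", RDF_TYPE, "ex:IPSweep"),
    ("ex:event42", "ex:hasSource", "ex:host_192_0_2_10"),
}


def infer_types(triples):
    """Apply the subclass rule repeatedly until no new triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                new = (s, RDF_TYPE, sup)
                if o == sub and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


closure = infer_types(ontology | facts)
event_types = sorted(o for s, p, o in closure if s == "ex:event42" and p == RDF_TYPE)
print(event_types)
```

With the data expressed this way, a query for all `ex:AttackEvent` instances would also return the IP sweep, even though it was only ever recorded as `ex:IPSweep`.
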
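On knowledge graph storage:
We currently store the knowledge in a MySQL database. This traditional relational storage is far from the semantic storage that a knowledge graph needs. We are considering using D2RQ to convert the relational data into an RDF representation.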
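
The basic idea behind a relational-to-RDF mapping can be sketched by hand: each row becomes a subject, and each column becomes a predicate. The example below is not D2RQ and does not use its mapping language; it is a minimal stand-in that assumes an in-memory SQLite database in place of MySQL, a hypothetical `alert` table, and a hypothetical `http://example.org/kg/` namespace, and it emits N-Triples lines.

```python
import sqlite3

BASE = "http://example.org/kg/"  # hypothetical namespace


def rows_to_ntriples(conn, table, key_column):
    """Map each row to one subject and each non-key column to one triple.

    The table name is interpolated into SQL, so it must be a trusted identifier.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    lines = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            predicate = f"<{BASE}{table}#{column}>"
            # Escape backslashes and quotes for an N-Triples string literal.
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


# Made-up table and row, standing in for the real MySQL schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alert (id INTEGER PRIMARY KEY, src_ip TEXT, signature TEXT)")
conn.execute("INSERT INTO alert VALUES (1, '192.0.2.10', 'ICMP PING')")
triples = rows_to_ntriples(conn, "alert", "id")
print("\n".join(triples))
```

A real mapping would also turn foreign keys into links between resources rather than string literals, which is the part that a tool such as D2RQ lets you declare instead of hand-coding.
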
## Dataset Selection
Consider DARPA's [LLS_DDOS](https://archive.ll.mit.edu/ideval/data/2000/LLS_DDOS_1.0.html), a DDoS attack dataset that divides the attack into five phases [1]:
(1) Pre-probe the network (IPSweep);
IPsweep of the AFB from a remote site
The adversary performs a scripted IPsweep of multiple class C subnets on the Air Force Base. The following networks are swept from address 1 to 254: 172.16.115.0/24, 172.16.114.0/24, 172.16.113.0/24, 172.16.112.0/24. The attacker sends ICMP echo-requests in this sweep and listens for ICMP echo-replies to determine which hosts are "up".
(2) Port scan to determine host vulnerability information (PortScan);
Probe of live IP's to look for the sadmind daemon running on Solaris hosts
The hosts discovered in the previous phase are probed to determine which hosts are running the "sadmind" remote administration tool. This tells the attacker which hosts might be vulnerable to the exploit that he/she has. Each host is probed, by the script, using the "ping" option of the sadmind exploit program, as provided on the Internet by "Cheez Whiz". The ping option makes a rpc request to the host in question, asks what TCP port number to connect to for the sadmind service, and then connects to the port number supplied to test to see if the daemon is listening.
(3) Gain administrator privileges (FTPBufOverflow);
Breakins via the sadmind vulnerability, both successful and unsuccessful on those hosts
The attacker then tries to break into the hosts found to be running the sadmind service in the previous phase. The attack script attempts the sadmind Remote-to-Root exploit several times against each host, each time with different parameters. Since this is a remote buffer-overflow attack, the exploit code cannot easily determine the appropriate stack pointer value as in a local buffer-overflow. Thus the adversary must try several different stack pointer values, each of which he/she has validated to work on some test machines. There are three stack pointer values attempted on each potential victim. With each attempt, the exploit tries to execute one command, as root, on the remote system. The attacker needs to execute two commands however, one to "cat" an entry onto the victim's /etc/passwd file and one to "cat" an entry onto the victim's /etc/shadow file. The new root user's name is 'hacker2' and hacker2's home directory is set to be /tmp. Thus, there are 6 exploit attempts on each potential victim host. To test whether or not a break-in was successful, the attack script attempts a login, via telnet, as hacker2, after each set of two breakin attempts. When successful the attacker's script moves on to the next potential victim.
(4) Install the trojan Mstream DDoS software (UploadSoftware);
Installation of the trojan mstream DDoS software on three hosts at the AFB
Entering this phase, the attack script has built a list of those hosts on which it has successfully installed the 'hacker2' user. These are mill (172.16.115.20), pascal (172.16.112.50), and locke (172.16.112.10). For each host on this list, the script performs a telnet login, makes a directory on the victim called "/tmp/.mstream/" and uses rcp to copy the server-sol binary into the new directory. This is the mstream server software. The attacker also installs a ".rhosts" file for themselves in /tmp, so that they can rsh in to startup the binary programs. On the first victim on the list, the attacker also installs the "master-sol" software, which is the mstream master. After installing the software on each host, the attacker uses rsh to startup first the master, and then the servers. As they come up, each server "registers" with the master that it is alive. The master writes out a database of live servers to a file called "/tmp/.sr".
(5) Use the compromised hosts to launch a DDoS attack against a remote server (DDOSAttack);
Launching the DDoS
In the final phase, the attacker manually launches the DDOS. This is performed via a telnet login to the victim on which the master is running, and then, from the victim, a "telnet" to port 6723 of the localhost. Port 6723/TCP is the port on which the master listens for connections to its user-interface. After entering a password for the user-interface, the attacker is given a prompt at which he/she enters two commands. The command "servers" causes the UI to list the mstream servers which have registered with it and are ready to attack. The command "mstream 131.84.1.31 5" causes a DDOS attack, of 5 second duration, against the given IP address to be launched by all three servers simultaneously. The mstream DDOS consists of many, many connection requests to a variety of ports on the victim. All packets have a spoofed, random source IP address. The attacker then logs out. The tiny duration was chosen so that it would be possible to easily distribute tcpdump and audit logs of these events -- to avoid them being too large. In real life, one might expect a DDOS of longer duration, several hours or more.
In the case of this scenario, however, it should be noted that the DDoS does not exactly succeed. The Mstream DDoS software attempts to flood the victim with ack packets that go to many random tcp ports on the victim host. The AirForce base firewall, the Sidewinder firewall, is not configured to pass traffic on all these ports, thus the only mstream packets that make it through the firewall are those on well-known ports. All other mstream packets result in a tcp reset being sent to the spoof source address. Thus in the DMZ dump file, one sees many resets apparently coming from "www.af.mil" going to the many spoofed source addresses. These are actually created by the firewall as a result of the receipt of the tcp packet for which the firewall is configured not to proxy!
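
The five phases above form an ordered chain, which suggests one simple way to reason about multi-step attacks: track how far along the chain each attacker has progressed and treat the next phase as the predicted step. The sketch below is a rough illustration of that idea only, not the method of [1]. It assumes alerts have already been labeled with a phase name and attributed to an attacker, which is a strong simplification (the DDoS packets in this scenario use spoofed source addresses), and the sample alerts are made up.

```python
# Phase labels as listed above, in attack order.
PHASES = ["IPSweep", "PortScan", "FTPBufOverflow", "UploadSoftware", "DDOSAttack"]


def track_progress(alerts):
    """Return, per attacker, the phases completed so far, counting only in-order steps.

    `alerts` is a time-ordered list of (attacker, phase) pairs.
    """
    progress = {}
    for attacker, phase in alerts:
        reached = progress.get(attacker, 0)
        # Advance only when the alert matches the next expected phase.
        if reached < len(PHASES) and phase == PHASES[reached]:
            progress[attacker] = reached + 1
    return {attacker: PHASES[:n] for attacker, n in progress.items()}


def predict_next(completed):
    """Return the phase expected after the completed ones, or None if the chain is done."""
    return PHASES[len(completed)] if len(completed) < len(PHASES) else None


# Hypothetical, time-ordered alerts.
alerts = [
    ("192.0.2.10", "IPSweep"),
    ("192.0.2.10", "PortScan"),
    ("198.51.100.7", "PortScan"),  # no preceding IPSweep, so not counted
    ("192.0.2.10", "FTPBufOverflow"),
]

state = track_progress(alerts)
print(state)
print(predict_next(state["192.0.2.10"]))
```

In a knowledge graph, the same chain could be stored as "precedes" edges between attack-phase nodes, so that the next-step lookup becomes a graph query instead of a list index.
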
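[1] 胡倩 (Hu Qian). 基于多步攻击场景的攻击预测方法 (An attack prediction method based on multi-step attack scenarios) [J]. 计算机科学 (Computer Science), 2019, 46(S1): 365-369.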
## Method Exploration and Literature Reading
Paper 1 [2]
A network security situation assessment model based on correlation analysis and HMM
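Situation element extraction, situation understanding, and situation assessment form a process in which basic static and dynamic information about network information systems and network security methods is progressively processed, through information fusion techniques, into information that network administrators can understand and use for decision-making.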