![](https://csdnimg.cn/release/download_crawler_static/10706459/bg1.jpg)
2018 年 6 月 Journal on Communications June 2018
2018104-1
第 39 卷第 6 期 通 信 学 报 Vol.39
No.6
基于二级索引结构的图压缩算法
李高超
1,2
,李犇
3
,卢毓海
4,5
,刘梦雅
6
,刘燕兵
4,5
(1. 中国科学院大学网络空间安全学院,北京 100049;2. 国家计算机网络与信息安全管理中心,北京 100029;
3. 北京市公安局朝阳分局,北京 100029;4. 中国科学院信息工程研究所,北京 100093;
5. 信息内容安全技术国家工程实验室,北京 100093;6. 英国南安普顿大学,南安普顿 SO171BJ)
摘 要:目前,各领域对图数据的分析、应用需求日益增加,且对结构复杂、耦合度高的大规模图数据的管理面
临着速度低下和空间开销大的双重挑战。面对图数据管理中查询耗时高和空间占比大的难题,提出一种图数据二
级索引压缩算法——GComIdx。该算法利用有序的键值(Key-Value)结构将相关节点和边尽可能地以相邻的方式
存储,并为高效的属性查询和邻居查询分别构造二级索引和 hash 节点索引。此外,为了节省存储空间,GComIdx
算法采用压缩算法来降低图数据磁盘空间占用率。实验结果表明,GComIdx 算法能够有效降低图数据计算的初始
化时间和图数据存储的磁盘空间占用,且查询时间小于通用数据库和其他 Key-Value 存储方案。
关键词:二级索引;图压缩;键值结构;属性查询;邻居查询
中图分类号:TN925
文献标识码:A
doi: 10.11959/j.issn.1000-436x.2018104
Graph compression algorithm based on a two-level index structure
LI Gaochao
1,2
, LI Ben
3
, LU Yuhai
4,5
, LIU Mengya
6
, LIU Yanbing
4,5
1. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
2. Management Center of National Computer Network and Information Security, Beijing 100029, China
3. Chaoyang Branch of Beijing Public Security Bureau, Beijing 100029, China
4. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
5. National Engineering Laboratory for Information Security Technologies, Beijing 100093, China
6. University of Southampton, Southampton SO171BJ, United Kingdom
Abstract: The demand for the analysis and application of graph data in various fields is increasing day by day. The man-
agement of large-scale graph data with complicated structure and high degree of coupling faces two challenges: one is
querying speed too slow, the other is space consumption too large. Facing the problems of long query time and large
space occupation in graph data management, a two-level index compression algorithm named GComIdx for graph data
was proposed. GComIdx algorithm used the ordered Key-Value structure to store the associated nodes and edges as
closely as possible, and constructed two-level index and hash node index for efficient attribute query and neighbor query.
Furthermore, GComIdx algorithm used a graph data compressed technology to compress the graph data before it directly
stored in hard disk, which could effectively reduce the storing space consumption. The experimental results show that
GComIdx algorithm can effectively reduce the initialization time of the graph data calculation and the disk space occu-
pancy of the graph data storing, meanwhile, the query time is less than common graph databases and other Key-Value
storage solutions.
Key words: two-level index, graph compress, Key-Value structure, attribute query, neighbors query
收稿日期:2017-11-08;修回日期:2018-03-30
通信作者:卢毓海,luyuhai@iie.ac.cn
基金项目:国家重点研发计划基金资助项目(No.2016YFB0800300);中国科学院信息工程研究所基础前沿基金资助项目
(No.Y7Z0351101)
Foundation Items: The National Key R&D Program of China (No.2016YFB0800300), The Fundamental Theory and Cutting Edge
Technology Research Program of Institute of Information Engineering, CAS (No.Y7Z0351101)