Title: The core of the big data solutions -- Map.
Author: pengwenwei
address: No.17-18 of XiangGangbatang Community, Xiangtan City of Hunan Province, China.
Language: C++
Platform: Windows, Linux
Technology: Perfect hash algorithm
Level: Advanced
Description: A high performance map algorithm
Section MFC c++ map stl
SubSection c++ algorithm
License: (GPLv3)
Maps are widely used in C++ programs, and their performance is often critical to overall program performance, especially with big data and in scenarios where the data cannot be distributed and processed in parallel.
I have worked on big data analysis for many years in the telecommunications and information security industries. The data analysis is so complicated that it cannot be done without maps. In the information security industry in particular, the data is much more complicated than elsewhere: for example, IP tables, MAC tables, telephone number tables, DNS tables, etc.
Currently, the STL map and Google's hash map are the most popular maps, but both have disadvantages. The STL map (std::map) is an ordered container, typically implemented as a balanced binary search tree, so every lookup costs O(log n) comparisons, which hurts performance on large data sets. Google's hash map has the best performance at present, but it is subject to hash collisions. For big data analysis, I consider that collision probability unacceptable.
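To make the comparison above concrete, here is a minimal sketch that uses only the C++ standard library. It assumes std::unordered_map as a stand-in for a general-purpose hash map (it is not Google's implementation), and countBucketCollisions and fillBoth are hypothetical helper names introduced for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <unordered_map>

// Number of keys that landed in a bucket already occupied by another key.
// Works with any standard unordered container.
template <typename HashMap>
std::size_t countBucketCollisions(const HashMap& m) {
    std::size_t extra = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        const std::size_t inBucket = m.bucket_size(b);
        if (inBucket > 1) extra += inBucket - 1;
    }
    return extra;
}

// Fill both containers with the same n distinct, scattered 32-bit keys
// (hypothetical test data, not taken from pwwMap).
inline void fillBoth(std::size_t n,
                     std::map<std::uint32_t, std::size_t>& ordered,
                     std::unordered_map<std::uint32_t, std::size_t>& hashed) {
    for (std::size_t i = 0; i < n; ++i) {
        // Multiplying by an odd constant keeps the keys distinct modulo 2^32.
        const std::uint32_t key = static_cast<std::uint32_t>(i) * 2654435761u;
        ordered[key] = i;  // tree insert: O(log n) comparisons
        hashed[key] = i;   // hash insert: one bucket lookup on average
    }
}
```

After fillBoth(100000, ordered, hashed), calling countBucketCollisions(hashed) reports how many keys share a bucket on a given standard library. The standard container resolves such collisions by comparing full keys inside the bucket, so they cost extra comparisons rather than wrong answers.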
Now I would like to publish pwwMap. It includes three different maps for different scenarios:
1. Memory map (memMap): It has good access speed, but its size is limited by the available memory.
2. Hard disk map (diskMap): It stores data on the hard disk, so it can hold much more data than the memory map.
3. Hash map (hashMap): It has the best performance and very fast lookups, but it does not support 'insert' or 'delete'.
memMap and diskMap can be converted to hashMap with the functions memMap2HashMap and diskMap2HashMap. According to my test results, my algorithms' collision probability is zero. As for performance, memMap is comparable to Google's hash map, and hashMap is 100 times faster than Google's hash map.
In summary, the pwwhash algorithms are perfect hash algorithms with zero collision probability. For the theory behind the key index and compression algorithm, refer to the following article:
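The general idea of a collision-free, read-only table built from a fixed key set can be sketched as follows. This is not pwwMap's algorithm and does not use its API: StaticPerfectTable is a hypothetical class, the brute-force seed search is a textbook technique swapped in for illustration, and the table is assumed to use twice as many slots as keys so that a collision-free seed is easy to find.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical read-only table (not part of pwwMap): built once from a fixed
// set of distinct keys, with no insert or erase afterwards.
class StaticPerfectTable {
    struct Slot {
        std::uint64_t key;
        int value;
        bool used;
    };

public:
    typedef std::vector<std::pair<std::uint64_t, int> > Items;

    explicit StaticPerfectTable(const Items& items)
        : slots_(items.size() * 2), seed_(0) {
        // Try seeds until every key lands in a slot of its own.
        const std::uint64_t maxSeed = 1u << 20;
        for (seed_ = 1; seed_ < maxSeed; ++seed_) {
            if (tryPlace(items)) return;
        }
        throw std::runtime_error("no collision-free seed found (duplicate keys?)");
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(std::uint64_t key) const {
        if (slots_.empty()) return nullptr;
        const Slot& s = slots_[slotOf(key)];
        return (s.used && s.key == key) ? &s.value : nullptr;
    }

private:
    // splitmix64 finalizer, used here as a generic 64-bit mixing function.
    static std::uint64_t mix(std::uint64_t x) {
        x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
        x ^= x >> 27; x *= 0x94d049bb133111ebULL;
        x ^= x >> 31;
        return x;
    }

    std::size_t slotOf(std::uint64_t key) const {
        return static_cast<std::size_t>(mix(key ^ mix(seed_)) % slots_.size());
    }

    bool tryPlace(const Items& items) {
        for (std::size_t i = 0; i < slots_.size(); ++i) slots_[i] = Slot();
        for (std::size_t i = 0; i < items.size(); ++i) {
            Slot& s = slots_[slotOf(items[i].first)];
            if (s.used) return false;  // two keys collided under this seed
            s.key = items[i].first;
            s.value = items[i].second;
            s.used = true;
        }
        return true;
    }

    std::vector<Slot> slots_;
    std::uint64_t seed_;
};
```

Once the constructor succeeds, every lookup touches exactly one slot, which is what makes a static table of this kind fast. The brute-force seed search only works for small key sets; practical perfect hash constructions for large data use more elaborate schemes.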
http://blog.csdn.net/chixinmuzi/article/details/1727195
Source code and documents:
https://sourceforge.net/projects/pwwhashmap/files/?source=navbar
I would like to transfer my technology on the condition that I can immigrate to the United States.
Please do not contact me by email, QQ, or telephone. We can talk face to face.
My email is pww71@sina.com and the password is 8622507. After sending, please make sure your email has been received. If it has not, the national security agency may be restricting me from receiving your mail, and you can only talk with me in person.
My address is as follows:
No.17-18 of XiangGangbatang Community, Xiangtan City of Hunan Province, China.
Self-developed hash algorithm and big data tool pwwMap, updated again; only the 64-bit Windows version is provided.
58 files in total: 11 .h, 8 .txt, 6 .cpp
Uploaded: 2022-06-20 10:01:23. Archive: 80KB, 7Z.
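Maps are used everywhere in C++ programs, and the bottleneck that limits program performance is often the map. This is especially true with big data, and when the business logic is so tightly coupled that the data cannot be distributed and processed in parallel. In those cases map performance becomes the most critical technology.
In my work in the telecommunications and information security industries I have always dealt with low-level big data. Information security data is the most complex of all, and none of it can be handled without maps: for example, lookups in IP tables, MAC tables, telephone number tables, domain name resolution tables, and ID card number tables, and cloud-based scanning for virus and trojan signatures.
The STL map looks keys up by binary search through an ordered structure, which makes it the slowest. Google's hash map currently has the best performance and memory usage, but it has a probability of collisions. Big data applications today generally do not use maps that can collide.
I am now releasing the pwwMap algorithm. You can test and compare for yourself: my algorithm has zero collision probability, yet it outperforms hash algorithms. Even the ordinary map performs about as well as Google's.
The most direct benefit of using my map is that a solution that used to need ten servers now needs only one.
Download addresses (the second has the continuously updated code): http://download.csdn.net/detail/pww71/9379828 http://sourceforge.net/projects/pww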
pwwMap20220620.7z (58 files)
pwwHash
x64
Debug
pwwHashMapDLL.dll 110KB
pwwHashMapDLL.lib 25KB
Release
pwwHashMapDLL.dll 34KB
pwwHashMapDLL.lib 25KB
updateLog.txt 328B
UpgradeLog.htm 31KB
pwwHashMapDLL
pwwHashMapDLL.vcxproj.filters 1KB
pwwHashMapDLL.vcproj.XP-20130909DUTF.Administrator.user 1KB
pwwHashMapDLL.vcxproj 8KB
pwwHashMapDLL.vcproj 4KB
pwwHashMapDLL.vcxproj.user 164B
linux error.txt 214B
threadLib
ResLock.hpp 2KB
General.hpp 906B
ThreadMgr.d 0B
ResLock.d 0B
Thread.d 0B
ResLock.o 7KB
Thread.o 47KB
ThreadMgr.hpp 2KB
ThreadMgr.cpp 5KB
ThreadMgr.o 57KB
Thread.cpp 8KB
ResLock.cpp 2KB
Thread.hpp 3KB
pwwHashLib
PwwHashMem.inl 3KB
safePwwStrMap.h 4KB
PwwMap.inl 22KB
PwwDiskMap.h 6KB
safePwwMap.inl 6KB
PwwStrMap.inl 9KB
safePwwMap.h 3KB
PwwHash.h 5KB
PwwStrMap.h 3KB
safePwwStrMap.inl 10KB
PwwMap.h 4KB
PwwHashMem.h 2KB
build.txt 204B
pwwhashforlinux
linuxForRun.txt 227B
pwwhash
main.cpp 11KB
-D_DEBUG 515B
make.sh 426B
Makefile 1KB
config.txt 96B
pwwHash
stdafx.h 233B
pwwHash.vcproj 6KB
pwwHash.vcxproj.user 168B
pwwHash.vcxproj 11KB
targetver.h 498B
stdafx.cpp 212B
ReadMe.txt 1KB
HiResTimer.h 398B
HiResTimer.cpp 902B
pwwHash.vcproj.XP-20130909DUTF.Administrator.user 1KB
Common.h 555B
User's Manual.txt 3KB
README.txt 3KB
pwwHash2019.sln 2KB
Uploader: pww71