simhash-demo
A simple demonstration of the SimHash algorithm that uses jieba to segment Chinese sentences into words.
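
As a rough illustration of the idea (not necessarily how this repository implements it), the sketch below computes a SimHash fingerprint from a list of already-segmented words. The function names `simhash` and `hamming_distance`, the use of MD5 as the per-token hash, term frequency as the weight, and the 64-bit width are all assumptions made for this example.

```python
import hashlib
from collections import Counter


def simhash(tokens, bits=64):
    """Return a `bits`-wide SimHash fingerprint (bits <= 128) for an iterable of tokens."""
    # Assumption: weight each token by how often it occurs.
    weights = Counter(tokens)
    vector = [0] * bits
    for token, weight in weights.items():
        # Stable per-token hash: the top `bits` bits of the MD5 digest.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest, "big") >> (128 - bits)
        # Add the weight where the hash bit is 1, subtract it where the bit is 0.
        for i in range(bits):
            vector[i] += weight if (h >> i) & 1 else -weight
    # Each positive component becomes a 1 bit in the fingerprint.
    fingerprint = 0
    for i, v in enumerate(vector):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

With jieba installed, the tokens could come from `jieba.lcut(sentence)`, which returns the segmented words as a list. Two sentences can then be compared with `hamming_distance(simhash(tokens_a), simhash(tokens_b))`, where a smaller distance suggests more similar text.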
### jieba
"Jieba" (结巴, Chinese for "to stutter") is a Chinese text segmentation library that aims to be the best Python module for Chinese word segmentation.