Deep Learning in Practice: word2vec
Deng Shujun (邓澍军), Lu Guangming (陆光明), Xia Long (夏龙)
NetEase Youdao (网易有道)
2014.02.27
Contents
1. What Is word2vec?
2. Quick Start
3. Author Trivia
4. Background
4.1 Word Vectors
4.2 Statistical Language Models
4.3 NNLM
4.4 Other NNLMs
4.5 Log-Linear Models
4.6 Log-Bilinear Models
4.6 Hierarchical Log-Bilinear Models
5. Models
5.1 CBOW
5.2 Skip-Gram
5.3 Why Use Hierarchical Softmax or Negative Sampling
6. Tricks
6.1 Exponential Computation
6.2 Random Sampling by Word Distribution
6.3 Hash Encoding
6.4 Random Numbers
6.5 Newline Characters
6.6 Subsampling of Frequent Words
7. Distributed Implementation
8. Summary
Reference Code
References