LDAGibbsSampling
================
/**
Copyright (C) 2013 by
SMU Text Mining Group/Singapore Management University/Peking University
LDAGibbsSampling is distributed for research purposes, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Feel free to contact the following people if you find any
problems in the package.
lyang@cs.umass.edu * */
Brief Introduction
===================
1. This is Liu Yang's implementation of Gibbs sampling for LDA (Latent Dirichlet Allocation). The test data set, Newsgroup-18828, is included in the project, and you can also run the code on other data sets. To start, import the project into Eclipse and run LdaGibbsSampling.java; no configuration is required. The original documents and sample output files are included.
2. Author's technical blog: http://blog.csdn.net/yangliuy
Author's homepage: https://people.cs.umass.edu/~lyang
For more information on LDA and Gibbs sampling: http://blog.csdn.net/yangliuy/article/details/8302599
3. This is an initial implementation of the Topic Expertise Model, which is proposed in the following paper:
Liu Yang, Minghui Qiu, Swapna Gottipati, Feida Zhu, Jing Jiang, Huiping Sun and Zhong Chen. CQARank: Jointly Model Topics and Expertise in Community Question Answering. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM 2013). (http://dl.acm.org/citation.cfm?id=2505720)
If you use this model implementation, please cite this paper.
4. We will also release more open-source code for topic models at https://github.com/yangliuy.
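
For readers who want to see the idea before opening the project, the core of collapsed Gibbs sampling for LDA can be sketched as below. This is a minimal, self-contained illustration and is not taken from LdaModel.java or LdaGibbsSampler.java. The class name `TinyLdaGibbs`, its field names, the toy corpus, and the hyperparameter values are all hypothetical and chosen only for this example. Each sweep removes one token from the count tables, computes the full conditional over topics, draws a new topic, and puts the token back.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative collapsed Gibbs sampler for LDA (hypothetical sketch, not the project's code). */
class TinyLdaGibbs {
    final int K, V;            // number of topics, vocabulary size
    final double alpha, beta;  // symmetric Dirichlet hyperparameters
    final int[][] docs;        // docs[m][n] = word id of the n-th token in document m
    final int[][] z;           // z[m][n]    = topic currently assigned to that token
    final int[][] nmk;         // nmk[m][k]  = tokens in document m assigned to topic k
    final int[][] nkt;         // nkt[k][t]  = times word t is assigned to topic k
    final int[] nk;            // nk[k]      = total tokens assigned to topic k
    final Random rng;

    TinyLdaGibbs(int[][] docs, int V, int K, double alpha, double beta, long seed) {
        this.docs = docs; this.V = V; this.K = K;
        this.alpha = alpha; this.beta = beta;
        this.rng = new Random(seed);
        z = new int[docs.length][];
        nmk = new int[docs.length][K];
        nkt = new int[K][V];
        nk = new int[K];
        // Start from a random topic assignment for every token.
        for (int m = 0; m < docs.length; m++) {
            z[m] = new int[docs[m].length];
            for (int n = 0; n < docs[m].length; n++) {
                int k = rng.nextInt(K);
                z[m][n] = k;
                nmk[m][k]++; nkt[k][docs[m][n]]++; nk[k]++;
            }
        }
    }

    /** One sweep: resample the topic of every token given all other assignments. */
    void sweep() {
        double[] p = new double[K];
        for (int m = 0; m < docs.length; m++) {
            for (int n = 0; n < docs[m].length; n++) {
                int t = docs[m][n], old = z[m][n];
                // Remove the current token from the counts.
                nmk[m][old]--; nkt[old][t]--; nk[old]--;
                // Unnormalized full conditional p(z = k | all other assignments).
                double sum = 0.0;
                for (int k = 0; k < K; k++) {
                    p[k] = (nmk[m][k] + alpha) * (nkt[k][t] + beta) / (nk[k] + V * beta);
                    sum += p[k];
                }
                // Draw the new topic by walking the cumulative distribution.
                double u = rng.nextDouble() * sum;
                int k = 0;
                double acc = p[0];
                while (acc < u && k < K - 1) { k++; acc += p[k]; }
                // Add the token back under its new topic.
                z[m][n] = k;
                nmk[m][k]++; nkt[k][t]++; nk[k]++;
            }
        }
    }

    /** Point estimate of the topic-word distributions: phi[k][t]. */
    double[][] phi() {
        double[][] phi = new double[K][V];
        for (int k = 0; k < K; k++)
            for (int t = 0; t < V; t++)
                phi[k][t] = (nkt[k][t] + beta) / (nk[k] + V * beta);
        return phi;
    }

    /** Point estimate of the document-topic distributions: theta[m][k]. */
    double[][] theta() {
        double[][] theta = new double[docs.length][K];
        for (int m = 0; m < docs.length; m++)
            for (int k = 0; k < K; k++)
                theta[m][k] = (nmk[m][k] + alpha) / (docs[m].length + K * alpha);
        return theta;
    }

    public static void main(String[] args) {
        // Toy corpus over a 4-word vocabulary (word ids 0..3).
        int[][] docs = { {0, 1, 0, 1, 0}, {2, 3, 2, 3, 3}, {0, 1, 1, 0}, {3, 2, 2, 3} };
        TinyLdaGibbs lda = new TinyLdaGibbs(docs, 4, 2, 0.5, 0.1, 42L);
        for (int i = 0; i < 200; i++) lda.sweep();
        for (double[] row : lda.theta()) System.out.println(Arrays.toString(row));
    }
}
```

In this sketch word ids are plain integers; a real pipeline would first map tokens to ids and remove stop words before sampling.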
LDA topic discovery Java source code
87 files in total (class: 22, txt: 21, java: 15)
Uploaded 2016-03-07 16:58:53; 2.39 MB RAR archive
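Note
The original author's code analyzes English text. I only added IK word segmentation on top of it so that it can perform topic discovery on Chinese text. The original author explains the theory and the source code in detail (http://blog.csdn.net/yangliuy/article/details/8457329); for the related documentation, please refer to the original author. Thank you!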
NLPLDAYL.rar (87 files)
NLPLDAYL
.project 384B
src
anshe
nlp
lad
segment
words
DefaultStopWordsHandler.java 2KB
ChineseTokenizer.java 3KB
lda
conf
PathConfig.java 190B
ConstantConfig.java 151B
script
DataPreprocess.java 321B
com
wordFreq.java 444B
Sorting.java 251B
Stopwords.java 17KB
ComUtil.java 15KB
MatrixUtil.java 2KB
FileUtil.java 13KB
main
LdaModel.java 8KB
LdaGibbsSampler.java 14KB
Documents.java 3KB
LdaGibbsSampling.java 3KB
.DS_Store 6KB
ik
ext.dic 44B
stopword.dic 316B
main2012.dic 2.91MB
quantifier.dic 2KB
IKAnalyzer.cfg.xml 414B
lib
ik2012lucene4.jar 47KB
SogouC.mini.20061102.tar.gz 136KB
.settings
org.eclipse.jdt.core.prefs 598B
org.eclipse.core.resources.prefs 84B
README.md 2KB
data
LdaResults
lda_90_phi.txt 408KB
lda_90_twords.txt 6KB
lda_100_theta.txt 2KB
lda_100_twords.txt 6KB
lda_90_theta.txt 2KB
lda_90_params.txt 115B
lda_100_tassign.txt 17KB
lda_100_phi.txt 408KB
lda_100_params.txt 115B
lda_90_tassign.txt 17KB
LdaParameter
LdaParameters.txt 74B
LdaOriginalDocs
15.txt 2KB
11.txt 608B
16.txt 4KB
14.txt 2KB
17.txt 1KB
18.txt 2KB
19.txt 952B
12.txt 766B
13.txt 658B
10.txt 3KB
原来 ("original")
52004 51KB
15177 60KB
66322 65KB
38851 60KB
53519 50KB
61435 56KB
58862 31KB
104343 15KB
54684 69KB
.classpath 457B
.gitignore 45B
bin
anshe
nlp
lad
segment
words
ChineseTokenizer.class 3KB
DefaultStopWordsHandler.class 2KB
lda
conf
PathConfig.class 502B
ConstantConfig.class 467B
script
DataPreprocess.class 309B
com
Stopwords.class 14KB
Sorting.class 885B
ComUtil.class 15KB
FileUtil$1.class 818B
wordFreq.class 932B
FileUtil.class 15KB
MatrixUtil.class 3KB
FileUtil$2.class 600B
ComUtil$1.class 961B
main
LdaModel.class 7KB
LdaGibbsSampling.class 4KB
Documents.class 1KB
LdaGibbsSampler.class 6KB
LdaGibbsSampling$modelparameters.class 690B
LdaGibbsSampling$parameters.class 1KB
Documents$Document.class 3KB
LdaModel$TwordsComparable.class 1KB
.DS_Store 6KB
ik
ext.dic 44B
stopword.dic 316B
main2012.dic 2.91MB
quantifier.dic 2KB
IKAnalyzer.cfg.xml 414B