This file goes together with the main dataset file: stanfordSentimentTreebank.zip
Here are the files containing all the raw scores and vocab. Some of the vocab may not be used in the final dataset.
This is due to changes in the data acquisition during our earlier iterations.
A raw score is an integer in the range [1, 25], with 1 being the most negative and 25 the most positive.
rawscores_exp12.txt contains an index followed by 3-6 raw scores (most entries have only 3, and only a handful have 4-6). The integers are separated by commas.
sentlex_exp12.txt contains an index, followed by a comma, and the phrase corresponding to the index.
Please note that some of the symbols, such as commas, have been converted to HTML tags.
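As an illustration of the two file layouts described above, here is a minimal Python sketch for reading one line from each file. The function names and the sample lines are hypothetical and not taken from the dataset; the sketch assumes each line follows exactly the comma-separated layout described here. The phrase is split off at the first comma only, and any HTML tags inside it are left untouched.

```python
def parse_raw_scores_line(line):
    """Parse one line of rawscores_exp12.txt: 'index,score,score,score[,...]'."""
    fields = [int(token) for token in line.strip().split(",")]
    index, scores = fields[0], fields[1:]
    return index, scores


def parse_lexicon_line(line):
    """Parse one line of sentlex_exp12.txt: 'index,phrase'."""
    # Split on the first comma only; the rest of the line is the phrase.
    index_text, phrase = line.rstrip("\r\n").split(",", 1)
    return int(index_text), phrase


# Hypothetical sample lines, for illustration only.
print(parse_raw_scores_line("7,13,14,12"))
print(parse_lexicon_line("7,a quiet film"))
```

Both functions return the index as an integer, so the scores and the phrase for the same entry can be joined on that index.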
For the final processed dataset, these responses were averaged, mapped to the range [0, 1], and then assigned to one of 5 classes using the following thresholds:
[0, 0.2], (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], (0.8, 1.0]
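The averaging and thresholding steps can be sketched in Python as follows. This is only an illustration: the exact mapping from [1, 25] to [0, 1] is not stated here, so the sketch assumes a linear rescaling, (mean - 1) / 24. The class labels 0 to 4 (most negative to most positive) and the function names are likewise assumptions.

```python
# Upper bounds of the first four intervals; the fifth class is (0.8, 1.0].
UPPER_BOUNDS = [0.2, 0.4, 0.6, 0.8]


def normalize(raw_scores):
    """Average raw scores in [1, 25] and rescale to [0, 1].

    Assumption: linear rescaling; the original mapping may differ.
    """
    mean = sum(raw_scores) / len(raw_scores)
    return (mean - 1) / 24


def to_class(value):
    """Map a value in [0, 1] to a class label 0..4 using the thresholds above."""
    for label, upper in enumerate(UPPER_BOUNDS):
        if value <= upper:  # intervals are closed on the right
            return label
    return 4


# Hypothetical raw scores, for illustration only.
print(to_class(normalize([13, 13, 13])))
```

Because every interval is closed on the right, a value that falls exactly on a threshold (for example 0.2) belongs to the lower class.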
For further questions, please see the README.txt of the main dataset file. Feel free to ask questions on the website:
http://nlp.stanford.edu/sentiment/