# cs7648-project
Repo for the CS 7648 (Interactive Robot Learning) final project: Applying Active Learning for Sentiment Analysis.

This repo contains an implementation of an active learning pipeline that lets a sentiment classification model learn to label tweets from a small set of labeled data. The pipeline queries a set of unlabeled tweets to be labeled by a human annotator. We experimented with different acquisition functions, which determine which tweets will help the model learn the most about the unlabeled dataset. We trained our models on the Sentiment140 Twitter dataset, which contains 1.6 million tweets with positive and negative sentiment labels. The data can be found here: https://www.kaggle.com/kazanova/sentiment140
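
As a rough illustration of what an acquisition function does, the sketch below scores each unlabeled tweet by how uncertain the model is about it and picks the top `k` for annotation. It covers least confidence, entropy, and random sampling. The function names, the `pool` probabilities, and the selection helper are hypothetical and are not taken from `active_learning.py`; the repo's actual implementation may differ.

```python
import math
import random


def least_confidence(probs):
    """Uncertainty as 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def entropy(probs):
    """Shannon entropy (in nats) of the predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, k, score_fn=None, seed=0):
    """Return indices of the k pool items to send to the annotator.

    pool_probs: list of per-tweet class-probability lists (hypothetical input).
    score_fn:   uncertainty function; None means uniform random sampling.
    """
    indices = list(range(len(pool_probs)))
    if score_fn is None:
        return random.Random(seed).sample(indices, k)
    # Most uncertain first; the sort is stable, so ties keep pool order.
    ranked = sorted(indices, key=lambda i: score_fn(pool_probs[i]), reverse=True)
    return ranked[:k]


# Made-up model outputs for four unlabeled tweets: [P(negative), P(positive)].
pool = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
lc_picks = select_queries(pool, 2, least_confidence)
ent_picks = select_queries(pool, 2, entropy)
rand_picks = select_queries(pool, 2)
print(lc_picks, ent_picks, rand_picks)
```

For a binary classifier, least confidence and entropy rank tweets identically, since both grow as the prediction approaches 50/50. They can diverge once there are more than two classes.
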

We implemented a CNN-based neural network for sentiment classification in PyTorch. The original paper for this architecture can be found here: https://arxiv.org/pdf/1408.5882.pdf
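
For readers unfamiliar with the architecture in the linked paper, here is a minimal sketch of a sentence-classification CNN in PyTorch: embed the tokens, apply parallel 1-D convolutions of several widths, max-pool over time, and classify. The class name `TextCNN` and every hyperparameter below (embedding size, filter counts, kernel sizes, dropout) are illustrative assumptions; they are not read from this repo's `train.py` or `cnn.pth`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNN(nn.Module):
    """Sketch of a CNN sentence classifier: embed, convolve, max-pool, classify."""

    def __init__(self, vocab_size, embed_dim=128, num_filters=100,
                 kernel_sizes=(3, 4, 5), num_classes=2, dropout=0.5):
        super().__init__()
        # Index 0 is reserved for padding (an assumption of this sketch).
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer tensor
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # One max-over-time feature vector per kernel width.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)  # unnormalized class logits


model = TextCNN(vocab_size=1000)
batch = torch.randint(1, 1000, (4, 20))  # 4 fake tweets, 20 token ids each
logits = model(batch)
probs = F.softmax(logits, dim=1)  # class probabilities, usable for acquisition scoring
print(logits.shape)
```

Inputs must be at least as long as the widest kernel (5 tokens here), so short tweets would need padding.
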
Link to Final Presentation: https://docs.google.com/presentation/d/16g83CSRMItJjRbkhx2ZDgRpDDaXu_Q9cr85_UnrvStY/edit?usp=sharing
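The referenced paper describes a series of experiments with convolutional neural networks built on top of word2vec. Despite little hyperparameter tuning, a simple CNN with one convolutional layer performs remarkably well. Its results add to the well-established evidence that unsupervised pre-training of word vectors is an important ingredient in deep learning for NLP.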
Package contents: text sentiment classification with a CNN in PyTorch (107 files in total)
human_labeled_tweets_lc_rk.csv 20KB
metrics.csv 13KB
results_ablation.csv 5KB
human_labeled_tweets_rk.csv 4KB
performance-analysis.ipynb 47KB
pipeline.jpg 38KB
README.md 1KB
manual_roy_cnn_active_learning_validation_accuracy_least_confidence_20000_50.npy 208B
cnn_active_learning_validation_accuracy_entropy_9000_10_rk.npy 208B
cnn_active_learning_validation_accuracy_least_confidence_20000_50_rk.npy 192B
cnn_active_learning_val_accuracy_tweet_count_15000_25.npy 192B
cnn_active_learning_val_accuracy_random_score_10000_25.npy 192B
cnn_active_learning_val_accuracy_random_score_10000_10.npy 192B
cnn_active_learning_val_accuracy_entropy_score_15000_50.npy 192B
cnn_active_learning_val_accuracy_entropy_score_10000_25.npy 192B
cnn_active_learning_val_accuracy_least_confidence_10000_50.npy 192B
cnn_active_learning_val_accuracy_tweet_count_20000_25.npy 192B
cnn_active_learning_val_accuracy_entropy_score_20000_25.npy 192B
cnn_active_learning_val_accuracy_random_score_10000_50.npy 192B
cnn_active_learning_val_accuracy_tweet_count_20000_10.npy 192B
cnn_active_learning_val_accuracy_tweet_count_15000_50.npy 192B
cnn_active_learning_val_accuracy_entropy_score_5000_10.npy 192B
cnn_active_learning_val_accuracy_tweet_count_5000_10.npy 192B
cnn_active_learning_val_accuracy_random_score_20000_25.npy 192B
cnn_active_learning_val_accuracy_least_confidence_20000_10.npy 192B
cnn_active_learning_val_accuracy_least_confidence_20000_25.npy 192B
cnn_active_learning_val_accuracy_tweet_count_10000_50.npy 192B
cnn_active_learning_val_accuracy_random_score_15000_10.npy 192B
cnn_active_learning_val_accuracy_tweet_count_20000_50.npy 192B
cnn_active_learning_val_accuracy_least_confidence_15000_25.npy 192B
cnn_active_learning_val_accuracy_least_confidence_10000_10.npy 192B
cnn_active_learning_val_accuracy_entropy_score_20000_50.npy 192B
cnn_active_learning_val_accuracy_least_confidence_20000_50.npy 192B
cnn_active_learning_val_accuracy_tweet_count_15000_10.npy 192B
cnn_active_learning_val_accuracy_entropy_score_5000_50.npy 192B
cnn_active_learning_val_accuracy_least_confidence_10000_25.npy 192B
cnn_active_learning_val_accuracy_tweet_count_10000_25.npy 192B
cnn_active_learning_val_accuracy_tweet_count_5000_25.npy 192B
cnn_active_learning_val_accuracy_least_confidence_5000_25.npy 192B
cnn_active_learning_val_accuracy_entropy_score_20000_10.npy 192B
cnn_active_learning_val_accuracy_least_confidence_5000_10.npy 192B
cnn_active_learning_val_accuracy_random_score_20000_10.npy 192B
cnn_active_learning_val_accuracy_entropy_score_15000_10.npy 192B
cnn_active_learning_val_accuracy_tweet_count_10000_10.npy 192B
cnn_active_learning_val_accuracy_random_score_5000_50.npy 192B
cnn_active_learning_val_accuracy_random_score_5000_25.npy 192B
cnn_active_learning_val_accuracy_least_confidence_15000_50.npy 192B
cnn_active_learning_val_accuracy_random_score_5000_10.npy 192B
cnn_active_learning_val_accuracy_random_score_15000_50.npy 192B
cnn_active_learning_val_accuracy_entropy_score_5000_25.npy 192B
cnn_active_learning_val_accuracy_random_score_15000_25.npy 192B
cnn_active_learning_val_accuracy_entropy_score_10000_10.npy 192B
cnn_active_learning_val_accuracy_entropy_score_10000_50.npy 192B
cnn_active_learning_val_accuracy_least_confidence_15000_10.npy 192B
cnn_active_learning_val_accuracy_tweet_count_5000_50.npy 192B
cnn_active_learning_val_accuracy_entropy_score_15000_25.npy 192B
cnn_active_learning_val_accuracy_random_score_20000_50.npy 192B
cnn_active_learning_val_accuracy_least_confidence_5000_50.npy 192B
cnn_train_loss.npy 168B
cnn_validation_accuracy.npy 168B
paper.pdf 236KB
roy_results.png 76KB
active_learning_random_v_lc.png 62KB
active_learning_lc_v_baseline.png 53KB
auto_v_manual_label_lc.png 51KB
avg_results.png 51KB
active_learning_entropy_v_random.png 51KB
active_learning_fix.png 50KB
active_learning_random.png 47KB
active_learning_exp_v_baseline.png 44KB
cnn_validation_accuracy_1.png 44KB
active_learning_entropy_rk.png 44KB
cnn_active_learning_lc_bert_9100.png 42KB
cnn_validation_accuracy_bert.png 39KB
active_learning_no_shuffle.png 34KB
active_learning_lc_auto_label.png 34KB
cnn_loss_active_learning_least_confidence_9000_sample_10.png 34KB
cnn_active_learning_least_confidence_9100.png 32KB
cnn_validation_accuracy.png 32KB
cnn_validation_active_learning_lc20000.png 31KB
active_learning_lc_20000_50_rk.png 31KB
cnn_validation_active_learning_lc9000.png 31KB
cnn_validation_active_learning_lc10000_bert.png 31KB
cnn_baseline_no_bert.png 29KB
active_learning_entropy.png 29KB
lstm_bert_validation_accuracy.png 29KB
cnn_supervised_accuracy_9100.png 29KB
cnn_train_loss_bert.png 28KB
cnn_train_loss_1.png 28KB
cnn_loss_active_learning_lc20000.png 27KB
cnn_train_loss.png 26KB
cnn_loss_active_learning_lc9000.png 24KB
lstm_bert_train_loss.png 24KB
cnn.pth 65.78MB
active_learning.py 16KB
runexperiments.py 14KB
load_data.py 11KB
train.py 6KB
plot_results.py 4KB
lightning_main.py 3KB