IMDbDataset: Transfer Learning, How to Measure Distribution Differences Between 2D Samples (resource, CSDN Library)

11 files in total: 4 pkl, 4 py, 1 ipynb

Tags: transfer learning, natural language processing, artificial intelligence, machine learning, nlp

Uploaded 2022-03-23 11:21:01 · 98KB ZIP

Exploring-the-potential-of-Transfer-Learning-master.zip (11 files)

Exploring-the-potential-of-Transfer-Learning-master

word_index_create.py 1KB

lstm_training.py 8KB

bert_training.py 6KB

create_data.py 856B

visualization.ipynb 104KB

results

batch_train_acc_lstm.pkl 232KB

batch_train_acc_bert.pkl 58KB

batch_dev_acc_lstm.pkl 29KB

batch_dev_acc_bert.pkl 7KB

requirements.txt 658B

README.md 1KB

# Exploring-the-potential-of-Transfer-Learning

## Project Details

The aim of the project is to show how the potential of Transfer Learning can be exploited in the field of NLP. The use case is a Sentiment Analysis project on the well-known IMDB Dataset, which you can download from the Kaggle website at [this](https://www.kaggle.com/datasets/ashirwadsangwan/imdb-dataset) page.

The project is organized as follows:

1. GloVe + LSTM
   - In *create_data.py*, the word index is loaded and saved using the **GloVe embeddings**.
   - In *lstm_training.py*, these pretrained word embeddings are used to set up a Bidirectional LSTM.
2. BERT + LSTM
   - In *word_index_create.py*, the *BERT* tokenizer is used instead to create the word index for the subsequent evaluation.
   - In *bert_training.py*, the representation from the last layer of BERT is used as the base on which a series of custom layers is built. In this case, the gradients of the BERT parameters are frozen, so those parameters are not updated during backpropagation.

Lastly, the notebook *visualization.ipynb* shows the core metrics recorded during the learning process for both models. With GloVe + LSTM we reached 85% accuracy on the validation set, while with the BERT + LSTM combination we reached 91% accuracy on the validation set.
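The word-index step in item 1 can be illustrated with a minimal sketch. It assumes the usual GloVe text layout (one token per line followed by its vector components) and is not the code from *create_data.py*; the names `build_word_index` and `encode`, and the `<pad>` and `<unk>` tokens, are hypothetical and chosen only for illustration.

```python
import io

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens, not taken from the repository


def build_word_index(lines):
    """Parse GloVe-style lines ("word v1 v2 ...") into a word index and an embedding matrix."""
    word_index = {PAD: 0, UNK: 1}
    vectors = []
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        if word in word_index:
            continue  # keep the first occurrence of each token
        word_index[word] = len(word_index)
        vectors.append([float(x) for x in parts[1:]])
    dim = len(vectors[0]) if vectors else 0
    # Rows 0 and 1 are zero vectors for the padding and unknown tokens.
    matrix = [[0.0] * dim, [0.0] * dim] + vectors
    return word_index, matrix


def encode(text, word_index, max_len):
    """Map a whitespace-tokenized review to a fixed-length list of ids."""
    ids = [word_index.get(tok, word_index[UNK]) for tok in text.lower().split()]
    ids = ids[:max_len]  # truncate long reviews
    return ids + [word_index[PAD]] * (max_len - len(ids))  # pad short ones


# Tiny in-memory stand-in for a GloVe text file.
sample = io.StringIO("the 0.1 0.2\nmovie 0.3 0.4\ngreat 0.5 0.6\n")
word_index, matrix = build_word_index(sample)
encoded = encode("The movie was great", word_index, max_len=6)
print(encoded)
```

Row `i` of `matrix` holds the vector for the token whose id is `i`, so the matrix could be used to initialize an embedding layer ahead of the LSTM. Tokens missing from the pretrained vocabulary ("was" in the sample) fall back to the unknown id.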
