Text Classification
-------------------------------------------------------------------------
The purpose of this repository is to explore text classification methods in NLP with deep learning.
UPDATE: if you want to try a model now, go to the folder 'a02_TextCNN' and run 'python -u p7_TextCNN_train.py'. It will train a model on the sample data and periodically print the loss and F1 score.
It contains a variety of baseline models for text classification.
It also supports multi-label classification, where multiple labels are associated with a sentence or document.
Although many of these models are simple and may not get you to the top level of the task, some of them are classics, so they serve well as baseline models.
Each model has a test function under its model class; you can run it first on a toy task. The models are independent of any particular dataset.
<a href='https://github.com/brightmart/text_classification/blob/master/multi-label-classification.pdf'>Check here for a formal report on large-scale multi-label text classification with deep learning</a>
Several models here can also be used for modelling question answering (with or without context), or for sequence generation.
We explore two seq2seq models (seq2seq with attention, and the Transformer from "Attention Is All You Need") for text classification. Both models can also be used for sequence generation and other tasks. If your task is multi-label classification, you can cast it as sequence generation.
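For example, a multi-label target can be rewritten as a short label sequence and decoded token by token. A minimal sketch (the label names and the `_GO`/`_END` tokens are only illustrative, not taken from this repository):

```python
# Hypothetical example: a multi-label target written as a short token sequence,
# so a seq2seq / Transformer decoder can generate the labels one by one.
labels = {"sports", "finance"}                        # the document's label set
target_sequence = ["_GO"] + sorted(labels) + ["_END"]
print(target_sequence)                                # ['_GO', 'finance', 'sports', '_END']
```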
We implement two memory networks. The first is the Dynamic Memory Network, which previously reached state of the art on question answering, sentiment analysis and sequence generation tasks; it is a so-called "one model for several different tasks" that still reaches high performance. It has four modules, and the key component is the episodic memory module: it uses a gating mechanism to perform attention, a gated GRU to update the episode memory, and another GRU (in the vertical direction) to update the hidden state. It is able to do transitive inference.
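A minimal sketch of the gated update inside the episodic memory module (the names are illustrative and not the repository's; `attention` and `gru_step` are assumed to be supplied by the caller):

```python
import numpy as np

def episode_update(facts, question, memory, attention, gru_step):
    """One pass of the episodic memory module (simplified): a gate scores each
    fact against the question and the current memory, then a gated GRU step
    decides how much of that fact enters the episode state."""
    h = np.zeros_like(memory)                   # episode state
    for fact in facts:                          # facts: encoded sentences
        g = attention(fact, question, memory)   # scalar gate in [0, 1]
        h = g * gru_step(fact, h) + (1.0 - g) * h
    return h                                    # later used to update the memory
```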
The second memory network we implemented is the Recurrent Entity Network ("Tracking the State of the World"). It keeps blocks of key-value pairs as memory which are updated in parallel, and it achieved a new state of the art. It can be used for modelling question answering with context (or history): for example, you can let the model read some sentences (as context), ask a question (as query), and have it predict an answer; if you feed the story in as the query as well, it can perform a classification task.
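A rough NumPy sketch of one memory update step with a gated write over parallel key-value blocks (the equations are simplified from the paper and the names are illustrative, not copied from the code here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entity_net_step(keys, values, s_t, U, V, W):
    """Update all memory blocks in parallel for one encoded input s_t.
    keys, values: [num_blocks, dim]; U, V, W: [dim, dim] shared weights."""
    gates = sigmoid(values @ s_t + keys @ s_t)                   # one gate per block
    candidate = np.tanh(values @ U.T + keys @ V.T + s_t @ W.T)   # candidate content
    values = values + gates[:, None] * candidate                 # gated write
    norms = np.linalg.norm(values, axis=1, keepdims=True)
    return values / np.maximum(norms, 1e-8)                      # renormalize each block
```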
If you need some sample data and word embeddings pretrained with word2vec, you can find them in closed issues, such as <a href="https://github.com/brightmart/text_classification/issues/3">issue 3</a>.
You can also find sample data in the folder "data". It contains two files: 'sample_single_label.txt', with 50k single-label examples, and 'sample_multiple_label.txt', with 20k multi-label examples. Input and label(s) are separated by " __label__".
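A minimal sketch of how one line of these files can be split on the "__label__" separator (the line below is a toy example, not real data):

```python
# Split one line of the sample files into the input text and its label(s).
line = "x1 x2 x3 x4 x5 __label__ 323434"
text, labels = line.split("__label__", 1)
text, label_list = text.strip(), labels.split()
print(text, label_list)   # "x1 x2 x3 x4 x5" ['323434']
```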
If you want to know more about text classification datasets, or about tasks these models can be used for, one option is the competition below:
https://biendata.com/competition/zhihu/
Models:
-------------------------------------------------------------------------
1) fastText
2) TextCNN
3) TextRNN
4) RCNN
5) Hierarchical Attention Network
6) seq2seq with attention
7) Transformer("Attention Is All You Need")
8) Dynamic Memory Network
9) EntityNetwork:tracking state of the world
10) Ensemble models
11) Boosting:
For a single model, stack identical models together; each layer is a model, and the result is based on the logits of all layers added together. The only connection between layers is the label weights: the front layer's per-label prediction error rate becomes the weight for the next layer, so labels with a high error rate get a large weight. Later layers therefore pay more attention to the mis-predicted labels and try to fix the mistakes of the former layers. As a result, we get a much stronger model.
check a00_boosting/boosting.py
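A rough NumPy sketch of this idea (illustrative only; the function names are not from boosting.py, which has the actual implementation):

```python
import numpy as np

def label_weights_from_errors(error_rate_per_label):
    """The front layer's per-label error rate becomes the next layer's label
    weights: labels that were often mis-predicted get a bigger weight."""
    w = 1.0 + error_rate_per_label
    return w / w.mean()                      # keep the average weight around 1

def weighted_loss(per_label_loss, label_weights):
    """The next layer's loss pays more attention to mis-predicted labels."""
    return float(np.sum(per_label_loss * label_weights))

def final_logits(layer_logits):
    """The ensemble prediction is the logits of all layers added together."""
    return np.sum(layer_logits, axis=0)
```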
and other models:
1) BiLstmTextRelation;
2) twoCNNTextRelation;
3) BiLstmTextRelationTwoRNN
Performance
-------------------------------------------------------------------------
(multi-label prediction task: predict the top 5 labels; 3 million training examples; full score: 0.5)
Model | fastText|TextCNN|TextRNN| RCNN | HierAtteNet|Seq2seqAttn|EntityNet|DynamicMemory|Transformer
--- | --- | --- | --- |--- |--- |--- |--- |--- |----
Score | 0.362 | 0.405| 0.358 | 0.395| 0.398 |0.322 |0.400 |0.392 |0.322
Training| 10m | 2h |10h | 2h | 2h |3h |3h |5h |7h
--------------------------------------------------------------------------------------------------
Ensemble of TextCNN, EntityNet and DynamicMemory: 0.411
Ensemble of EntityNet and DynamicMemory: 0.403
--------------------------------------------------------------------------------------------------
Notice:
`m` stands for **minutes**; `h` stands for **hours**;
`HierAtteNet` means Hierarchical Attention Network;
`Seq2seqAttn` means Seq2seq with attention;
`DynamicMemory` means Dynamic Memory Network;
`Transformer` stands for the model from 'Attention Is All You Need'.
Usage:
-------------------------------------------------------------------------------------------------------
1) the model is in `xxx_model.py`
2) run `python xxx_train.py` to train the model
3) run `python xxx_predict.py` to do inference (test).
Each model has a test method under its model class. You can run the test method first to check whether the model works properly.
-------------------------------------------------------------------------
Environment:
-------------------------------------------------------------------------------------------------------
python 2.7 + TensorFlow 1.1
(TensorFlow 1.2, 1.3 and 1.4 also work; most models should work fine with other TensorFlow versions as well, since we use very few features bound to a specific version. Python 3.5 is also fine as long as you change the print statements and try/except blocks.)
The TextCNN model has already been ported to python 3.6.
-------------------------------------------------------------------------
Notice:
-------------------------------------------------------------------------------------------------------
Some utility functions are in data_util.py;
a typical input looks like: "x1 x2 x3 x4 x5 __label__ 323434", where 'x1, x2, ...' are words and '323434' is the label;
it has a function to load pretrained word embeddings (trained with word2vec or fastText) and assign them to the model.
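A minimal sketch of assigning a pretrained embedding matrix to a model variable in TensorFlow 1.x (the names and shapes are illustrative; this is not data_util.py's actual API):

```python
import numpy as np
import tensorflow as tf

vocab_size, embed_size = 50000, 100
embedding = tf.get_variable("embedding", shape=[vocab_size, embed_size])

# Feed the pretrained matrix (from word2vec or fastText) in once at startup
# via a placeholder and an assign op, instead of baking it into the graph.
pretrained = tf.placeholder(tf.float32, shape=[vocab_size, embed_size])
assign_embedding = tf.assign(embedding, pretrained)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    word2vec_matrix = np.random.randn(vocab_size, embed_size).astype(np.float32)  # stand-in
    sess.run(assign_embedding, feed_dict={pretrained: word2vec_matrix})
```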
Models Detail:
-------------------------------------------------------------------------
1.fastText:
-------------
Implementation of <a href="https://arxiv.org/abs/1607.01759">Bag of Tricks for Efficient Text Classification</a>
After each word in the sentence is embedded, the word representations are averaged into a text representation, which is in turn fed to a linear classifier. A softmax function computes the probability distribution over the predefined classes, and cross-entropy is used to compute the loss. A bag-of-words representation does not consider word order, so n-gram features are used to capture some partial information about the local word order. When the number of classes is large, computing the linear classifier is computationally expensive, so hierarchical softmax is used to speed up training.
1) use bi-gram and/or tri-gram
2) use NCE loss to speed up the softmax computation (instead of the hierarchical softmax from the original paper)
result: performance is as good as in the paper, and speed is also very fast.
check: p5_fastTextB_model.py
![alt text](https://github.com/brightmart/text_classification/blob/master/images/fastText.JPG)
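A minimal NumPy sketch of the forward pass described above: average the word vectors, apply a linear classifier, then a softmax (the names and shapes are illustrative; the actual model uses NCE loss during training):

```python
import numpy as np

def fasttext_forward(word_ids, embedding, W, b):
    """word_ids: indices of the document's tokens (uni-grams plus optional n-gram ids).
    embedding: [vocab, dim]; W: [dim, num_classes]; b: [num_classes]."""
    doc_vector = embedding[word_ids].mean(axis=0)   # average the word vectors
    logits = doc_vector @ W + b                     # linear classifier
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()                      # softmax over the classes
```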
-------------------------------------------------------------------------
2.TextCNN:
-------------
Implementation of <a href="http://www.aclweb.org/anthology/D14-1181">Convolutional Neural Networks for Sentence Classification</a>