Named Entity Recognition Based on Deep Learning (.zip resource) - CSDN文库

1191 files in total

py: 547

pyc: 540

txt: 10

Points required: 5; 136 views; uploaded 2024-05-10 10:39:24; 111.46MB ZIP

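Named entity recognition based on deep learning

Deep learning (DL) is a research direction within machine learning (ML). Its goal is to give machines a human-like ability to analyze and learn, so that they can recognize data such as text, images, and sound. By learning the underlying regularities and layered representations of sample data, deep learning lets machines imitate human activities such as seeing, hearing, and reasoning, and so solve complex pattern recognition problems.

The core of deep learning is the neural network. A network consists of several layers, and each layer contains several neurons. A neuron takes the outputs of the previous layer's neurons as input, applies a weighted sum followed by a transformation, and passes the result to the next layer, until the final layer produces the model's output. The weights and biases are the network's parameters; they determine the relationship between input and output values.

Training usually relies on the backpropagation algorithm, which computes how each parameter should change so that the network fits the data better. Training data is fed into the network, and forward propagation carries it from the input layer to the output layer. The difference between the network's output and the true labels is then measured by a loss function. Backpropagation adjusts the parameters to reduce the loss, and this repeats until the error falls below a chosen threshold.

Two widely used network types are convolutional neural networks (CNN) and recurrent neural networks (RNN). CNNs are particularly good at image data: layer-by-layer convolution and pooling progressively extract higher-level features from an image. RNNs suit sequence data such as text or time series, and produce their output by capturing dependencies within the sequence.

Deep learning has achieved notable results in many fields, including computer vision and image recognition, natural language processing, speech recognition and generation, recommender systems, game development, medical image recognition, financial risk control, intelligent manufacturing, shopping, and genomics. As the technology develops, it is expected to show its potential in more fields.

Future research topics and challenges may include self-supervised learning, few-shot learning, federated learning, automated machine learning, multimodal learning, adaptive learning, and quantum machine learning. These directions are expected to drive the further development and application of deep learning.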
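The training cycle described above (forward pass, loss, gradient-based parameter update, stop at an error threshold) can be sketched for the smallest possible case: a single linear neuron with one weight and one bias. This is only an illustration and is not taken from the archive; the function name `train`, the toy data, the learning rate, and the threshold are all assumptions chosen for the example.

```python
def train(xs, ys, lr=0.05, threshold=1e-6, max_steps=10_000):
    """Fit y = w * x + b with gradient descent on mean squared error."""
    w, b = 0.0, 0.0          # the neuron's parameters: one weight, one bias
    n = len(xs)
    loss = float("inf")
    for _ in range(max_steps):
        # Forward pass: compute the neuron's output for every sample.
        preds = [w * x + b for x in xs]
        # Loss: mean squared difference between outputs and labels.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        if loss < threshold:  # stop once the error is below the threshold
            break
        # Backward pass: gradients of the loss with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Update: move each parameter against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
w, b, loss = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(f"w={w:.3f}, b={b:.3f}, loss={loss:.2e}")
```

A real network stacks many such neurons in layers and applies the chain rule through all of them, but the loop has the same shape.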


基于深度学习的命名实体识别.zip (1191 files)

pip3.6 266B

python3.6 24B

activate 2KB

sysconfig.cfg 3KB

pyvenv.cfg 75B

checkpoint 585B

checkpoint 91B

config_file 259B

conlleval 12KB

activate.csh 1KB

ner.ckpt-744.data-00000-of-00001 19.47MB

ner.ckpt-768.data-00000-of-00001 19.47MB

ner.ckpt-1032.data-00000-of-00001 19.47MB

ner.ckpt-984.data-00000-of-00001 19.47MB

ner.ckpt-816.data-00000-of-00001 19.47MB

albert_model.ckpt.data-00000-of-00001 16.38MB

time.dev 134KB

.DS_Store 14KB

.DS_Store 10KB

t64.exe 104KB

w64.exe 98KB

t32.exe 95KB

w32.exe 88KB

gui-64.exe 74KB

cli-64.exe 73KB

cli-32.exe 64KB

cli.exe 64KB

gui-32.exe 64KB

gui.exe 64KB

activate.fish 2KB

.gitignore 176B

DiseaseNer.iml 559B

ner.ckpt-744.index 4KB

ner.ckpt-984.index 4KB

ner.ckpt-1032.index 4KB

ner.ckpt-816.index 4KB

ner.ckpt-768.index 4KB

albert_model.ckpt.index 1KB

INSTALLER 4B

albert_large_zh_parameters.jpg 211KB

state_of_the_art.jpg 118KB

albert_performance.jpg 118KB

add_data_removing_dropout.jpg 96KB

albert_configuration.jpg 90KB

xlarge_loss.jpg 81KB

albert_tiny_compare_s_old.jpg 63KB

crmc2018_compare_s.jpg 62KB

albert_tiny_compare_s.jpg 47KB

albert_config_xxlarge.json 564B

albert_config_xlarge.json 563B

albert_config_large.json 563B

albert_config_base.json 563B

albert_config_tiny.json 562B

bert_config.json 518B

LICENSE 1KB

train.log 159KB

README.md 19KB

README.md 668B

ner.ckpt-1032.meta 1.29MB

ner.ckpt-984.meta 1.29MB

ner.ckpt-816.meta 1.29MB

ner.ckpt-768.meta 1.29MB

ner.ckpt-744.meta 1.29MB

albert_model.ckpt.meta 184KB

METADATA 5KB

METADATA 4KB

cacert.pem 258KB

pip 266B

pip3 266B

maps.pkl 332B

distutils-precedence.pth 152B

pyparsing.py 267KB

pyparsing.py 227KB

uts46data.py 192KB

langrussianmodel.py 128KB

html5parser.py 114KB

__init__.py 106KB

langbulgarianmodel.py 103KB

langthaimodel.py 101KB

langhungarianmodel.py 100KB

langgreekmodel.py 97KB

langhebrewmodel.py 96KB

langturkishmodel.py 94KB

tarfile.py 90KB

easy_install.py 83KB

constants.py 82KB

_tokenizer.py 75KB

util.py 58KB

locators.py 51KB

msvc.py 50KB

database.py 50KB

modeling.py 49KB

dist.py 49KB

ccompiler.py 46KB

create_pretraining_data.py 43KB

distro.py 43KB


# albert_zh

An implementation of <a href="https://arxiv.org/pdf/1909.11942.pdf">A Lite Bert For Self-Supervised Learning Language Representations</a> with TensorFlow.

ALBERT is based on BERT, with some improvements. It achieves state-of-the-art performance on the main benchmarks with 30% fewer parameters.

albert_base_zh has only ten percent of the parameters of the original BERT model, and most of the accuracy is retained.

Different versions of the ALBERT pre-trained model for Chinese, including TensorFlow, PyTorch and Keras, are available now.

ALBERT models pre-trained on a massive Chinese corpus: fewer parameters, better results. A small pre-trained model can also handle 13 NLP tasks, and ALBERT's three changes took it to the top of the GLUE benchmark.

For more datasets, baseline models, and detailed comparisons of model performance on different tasks, see <a href="https://github.com/chineseGLUE/chineseGLUE">the Chinese task benchmark chineseGLUE</a>.

<img src="https://github.com/brightmart/albert_zh/blob/master/resources/albert_tiny_compare_s.jpg" width="90%" height="70%" />

Download Pre-trained Models of Chinese
-----------------------------------------------

1. <a href="https://storage.googleapis.com/albert_zh/albert_tiny.zip">albert_tiny_zh</a>, file size 16M, 1.8M parameters

   Training and inference are about 10 times faster, accuracy is largely retained, and the model is 1/25 the size of BERT. It reaches 85.4% on the test set of the semantic similarity dataset LCQMC, only 1.5 points below bert_base.

   LCQMC training used the following parameters: --max_seq_length=128 --train_batch_size=64 --learning_rate=1e-4 --num_train_epochs=5

   albert_tiny uses the same large-scale Chinese corpus, has only 4 layers, and greatly reduces vector dimensions such as the hidden size.

   Use cases: relatively simple tasks or tasks with strict latency requirements, such as sentence-pair tasks (for example semantic similarity) and classification. For harder tasks such as reading comprehension, use one of the larger models.

2. <a href="https://storage.googleapis.com/albert_zh/albert_large_zh.zip">albert_large_zh</a>, 24 layers, file size 64M

   Parameter count and model size are one sixth of bert_base. On the test set of LCQMC, a dataset of colloquial sentence similarity, it scores 0.2 points higher than bert_base.

3. <a href="https://storage.googleapis.com/albert_zh/albert_base_zh_additional_36k_steps.zip">albert_base_zh (trained on an additional 150 million instances, i.e. 36k steps * batch_size 4096)</a>; <a href="https://storage.googleapis.com/albert_zh/albert_base_zh.zip">albert_base_zh (small trial version)</a>, 12M parameters, 12 layers, size 40M

   Parameter count is one tenth of bert_base, and so is model size. On the LCQMC test set it scores about 0.6 to 1 point lower than bert_base. Compared with no pre-training, albert_base improves by 14 points.

4. <a href="https://storage.googleapis.com/albert_zh/albert_xlarge_zh_177k.zip">albert_xlarge_zh_177k</a>; <a href="https://storage.googleapis.com/albert_zh/albert_xlarge_zh_183k.zip">albert_xlarge_zh_183k (try this one first)</a>, 24 layers, file size 230M

   Parameter count and model size are half of bert_base. A large GPU is required. A full test comparison will be added later. batch_size must not be too small, or accuracy may suffer.

Updates
-----------------------------------------------

**\*\*\*\*\* 2019-10-15: albert_tiny_zh, 10 times faster than bert base for training and inference, accuracy remains \*\*\*\*\***

**\*\*\*\*\* 2019-10-07: more models of albert \*\*\*\*\***

Added albert_xlarge_zh and albert_base_zh_additional_steps, trained with more instances.

**\*\*\*\*\* 2019-10-04: PyTorch and Keras versions of albert were supported \*\*\*\*\***

a. Convert to the PyTorch version and run your tasks through <a href="https://github.com/lonePatient/albert_pytorch">albert_pytorch</a>

b. Load the pre-trained model with Keras in one line of code through <a href="https://github.com/bojone/bert4keras">bert4keras</a>

c. Use albert with TensorFlow 2.0: use or load the pre-trained model with tf2.0 through <a href="https://github.com/kpe/bert-for-tf2">bert-for-tf2</a>

Releasing albert_xlarge on 6th Oct.

**\*\*\*\*\* 2019-10-02: albert_large_zh,albert_base_zh \*\*\*\*\***

Released albert_base_zh with only 10% of the parameters of bert_base: a small model (40M) whose training can be very fast.

Released albert_large_zh with only 16% of the parameters of bert_base (64M).

**\*\*\*\*\* 2019-09-28: codes and test functions \*\*\*\*\***

Added code and test functions for the three main changes of ALBERT from BERT.

Introduction of ALBERT
-----------------------------------------------

ALBERT is an improved version of BERT. Unlike other recent state-of-the-art models, it is a small pre-trained model that achieves better results with fewer parameters. It makes three main changes to BERT:

1) Factorized embedding parameterization

   The embedding parameters go from O(V * H) to O(V * E + E * H).

   Taking ALBert_xxlarge as an example, with V=30000, H=4096, E=128: the original parameter count is V * H = 30000 * 4096 ≈ 123 million, while the factorized count is V * E + E * H = 30000 * 128 + 128 * 4096 ≈ 3.84 million + 0.52 million ≈ 4.36 million. The embedding-related parameters before the change are about 28 times those after it.

2) Cross-layer parameter sharing

   Parameter sharing significantly reduces the number of parameters. Sharing can be split into sharing the fully connected layers' parameters and sharing the attention layers' parameters; sharing the attention-layer parameters hurts performance less.

3) Inter-sentence coherence loss

   A sentence-order task is used. Positive examples are two consecutive text segments from one document; negative examples are the same two consecutive segments with their order swapped. This avoids the original NSP task, which implicitly includes topic prediction, a task that is too easy.

   We maintain that inter-sentence modeling is an important aspect of language understanding, but we propose a loss based primarily on coherence. That is, for ALBERT, we use a sentence-order prediction (SOP) loss, which avoids topic prediction and instead focuses on modeling inter-sentence coherence. The SOP loss uses as positive examples the same technique as BERT (two consecutive segments from the same document), and as negative examples the same two consecutive segments but with their order swapped. This forces the model to learn finer-grained distinctions about discourse-level coherence properties.

Other changes:

1) Remove dropout to enlarge the capacity of the model

   Even after 1 million training steps, the largest model still did not overfit the training data. This suggests the model's capacity could be larger, so dropout was removed. (Dropout can be seen as randomly removing part of the network, which also makes the network somewhat smaller.)

   We also note that, even after training for 1M steps, our largest models still do not overfit to their training data. As a result, we decide to remove dropout to further increase our model capacity.

   For the other model sizes, our implementation keeps the original dropout rate to prevent overfitting to the training data.

2) Use LAMB as the optimizer, to speed up training with a big batch size

   Training used a large batch_size (4096). The LAMB optimizer makes it possible to train with very large batch sizes, for example up to 60,000.

3) Use n-gram masking (uni-gram, bi-gram, tri-gram) for the masked language model

   N-grams are used with different probabilities: uni-gram has the highest probability, bi-gram the next, and tri-gram the lowest. This project currently uses whole word masking on Chinese; a comparison with n-gram masking will be added later. The n-gram approach comes from spanBERT.

Training Data & Configuration
-----------------------------------------------

30 GB of Chinese text, more than 10 billion Chinese characters, drawn from several encyclopedias, news sources, and interactive communities. The pre-training sequence length sequence_length is set to 512 and the batch size batch_size to 4096.
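
The embedding factorization arithmetic from the ALBERT introduction (O(V * H) versus O(V * E + E * H)) can be checked with a few lines of Python. This is a minimal sketch, not code from the repository; the helper name `embedding_params` is hypothetical, and the values of V, H, and E are the ALBert_xxlarge figures quoted in the README.

```python
def embedding_params(vocab_size, hidden_size, embedding_size=None):
    """Count embedding parameters, with or without factorization.

    Without factorization the table is vocab_size x hidden_size.
    With factorization it is a vocab_size x embedding_size table
    followed by an embedding_size x hidden_size projection.
    """
    if embedding_size is None:
        return vocab_size * hidden_size
    return vocab_size * embedding_size + embedding_size * hidden_size


V, H, E = 30000, 4096, 128                 # values quoted for ALBert_xxlarge
direct = embedding_params(V, H)            # V * H
factorized = embedding_params(V, H, E)     # V * E + E * H
ratio = direct / factorized

print(f"{direct:,} vs {factorized:,} ({ratio:.1f}x)")
# prints "122,880,000 vs 4,364,288 (28.2x)"
```

The saving grows with the gap between H and E: the V * E term dominates the factorized count, so the ratio is close to H / E only when the vocabulary is much larger than the hidden size.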

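The sentence-order prediction (SOP) examples described under the three main changes can be built as sketched here. This is an illustration under stated assumptions, not the repository's data pipeline: the function name `make_sop_examples` and the label convention (1 for the original order, 0 for the swapped order) are hypothetical, and this version emits both orderings for every pair, whereas a real pipeline would more likely choose one of the two at random.

```python
def make_sop_examples(segments):
    """Build SOP training triples from one document's ordered segments.

    Each pair of consecutive segments yields a positive example in the
    original order and a negative example with the order swapped.
    """
    examples = []
    for first, second in zip(segments, segments[1:]):
        examples.append((first, second, 1))  # positive: original order
        examples.append((second, first, 0))  # negative: same pair, swapped
    return examples


# A hypothetical three-segment document.
doc = ["segment A", "segment B", "segment C"]
examples = make_sop_examples(doc)
for seg_a, seg_b, label in examples:
    print(label, "|", seg_a, "->", seg_b)
```

Because both segments of a negative example come from the same document, topic alone cannot separate the classes, which is the point the README makes against the original NSP task.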