BiLSTM + CRF for NER (Semi-NER-CRF-KNN resource, CSDN Library)

28 files in total

txt: 6

py: 6

pyc: 4

Uploaded 2018-09-03 11:24:59 · 39.47 MB ZIP

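**Title and description**

The title, "BILSTM + CRF FOR NER", names the two core techniques: the bidirectional long short-term memory network (Bidirectional Long Short-Term Memory, BiLSTM) and the conditional random field (Conditional Random Field, CRF). Both are widely used for named entity recognition (Named Entity Recognition, NER). NER is an important subtask of natural language processing (NLP) that aims to automatically identify entities with a specific meaning in text, such as person, location, and organization names.

**BiLSTM**

A BiLSTM is a variant of the LSTM (Long Short-Term Memory), which is itself a recurrent neural network (RNN) architecture for sequence data. The LSTM's gating mechanism mitigates the vanishing-gradient problem of plain RNNs and lets the model learn long-range dependencies; exploding gradients are usually handled separately, for example by gradient clipping. A BiLSTM adds a backward LSTM to the forward one, so each position sees both its left and its right context, which captures more of the semantic information in the text.

**CRF**

A CRF is a statistical model commonly used for sequence labeling tasks such as NER. It improves labeling accuracy by modeling the dependencies between the label at the current position and the labels of its neighbors. Unlike a per-position softmax classifier, a CRF scores whole label sequences and selects the best-scoring one, which avoids the inconsistent label sequences that can result from predicting each position in isolation.

**NER task**

NER is one of the basic NLP tasks: it identifies entities in text and classifies them, for example as person (PER), location (LOC), or organization (ORG). It is widely used in information extraction, question answering, and machine translation.

**Implementation in TensorFlow**

The archive name "zh-NER-TF-master" indicates a Chinese NER project implemented with TensorFlow, an open-source library for building and training deep learning models. The project probably covers the full pipeline: data preprocessing, model construction (BiLSTM + CRF), training, and evaluation.

**Components the project may contain**

1. **Dataset**: possibly preprocessed Chinese NER data, such as an annotated corpus in CoNLL format.
2. **Preprocessing scripts**: convert the raw data into a format the model can read.
3. **Model definition**: the BiLSTM and CRF layers and how they are combined to solve the NER problem.
4. **Training script**: defines the training process, including the loss function, the optimizer, and the training loop.
5. **Evaluation script**: computes performance metrics on the validation or test set, such as precision, recall, and F1 score.
6. **Model saving and loading**: saves the trained model for later use or further training.

**Summary**

The BiLSTM + CRF combination performs well on NER because it makes effective use of sequence information. This project appears to provide a complete TensorFlow-based solution, from data processing through training and evaluation, and is a useful reference for understanding and practicing the technique.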
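The entity-level precision, recall, and F1 mentioned in the component list can be illustrated with a small, self-contained sketch. This is not the project's own evaluation code (the archive ships `conlleval_rev.pl`, presumably for that purpose, and its exact behavior is not reproduced here). The function names `extract_entities` and `precision_recall_f1` are hypothetical, and stray `I-` tags with no matching preceding tag are simply ignored, which is an assumption made for brevity.

```python
from typing import List, Set, Tuple

Entity = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (type, start, end) spans from a BIO tag sequence."""
    entities: Set[Entity] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((etype, start, i))  # close the span that just ended
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]  # open a new span
    return entities


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores: a predicted span counts only if type and boundaries match."""
    gold_ents = extract_entities(gold)
    pred_ents = extract_entities(pred)
    correct = len(gold_ents & pred_ents)
    p = correct / len(pred_ents) if pred_ents else 0.0
    r = correct / len(gold_ents) if gold_ents else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


gold = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "B-ORG"]
pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "O", "B-ORG"]
# The LOC span is cut short in `pred`, so 2 of 3 spans match: P = R = F1 = 2/3.
p, r, f = precision_recall_f1(gold, pred)
```

Scoring whole spans rather than individual tags means a prediction with a wrong boundary earns no credit, which is why entity-level F1 is typically lower than per-character accuracy.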

zh-NER-TF-master.zip (28 files)

zh-NER-TF-master

data_path

word2id.pkl 60KB

original

train1.txt 9.99MB

test1.txt 514KB

link.txt 49B

testright1.txt 564KB

test_data 1.06MB

train_data 13.26MB

data_path_save

1521112368

checkpoints

model-31680.meta 5.06MB

model-31680.data-00000-of-00001 29.96MB

checkpoint 79B

model-31680.index 1KB

results

log.txt 9KB

summaries

data.py 8KB

pics

pic1.png 768KB

pic2.png 284KB

demo.txt 961B

utils.py 3KB

conlleval_rev.pl 12KB

main.py 8KB

model.py 14KB

eval.py 848B

README.md 3KB

__pycache__

eval.cpython-36.pyc 912B

utils.cpython-36.pyc 2KB

model.cpython-36.pyc 10KB

data.cpython-36.pyc 4KB

01实验.py 2KB

.gitignore 28B

## A simple BiLSTM-CRF model for Chinese Named Entity Recognition

This repository includes the code for building a very simple __character-based BiLSTM-CRF sequence labelling model__ for the Chinese Named Entity Recognition task. Its goal is to recognize three types of named entity: PERSON, LOCATION and ORGANIZATION.

This code works on __Python 3 & TensorFlow 1.2__, and the repository [https://github.com/guillaumegenthial/sequence_tagging](https://github.com/guillaumegenthial/sequence_tagging) was a great help.

### model

This model is similar to the models provided by papers [1] and [2]. Its structure looks like the following illustration:

![Network](./pics/pic1.png)

For one Chinese sentence, each character has (or will be assigned) a tag from the set {O, B-PER, I-PER, B-LOC, I-LOC, B-ORG, I-ORG}.

The first layer, the __look-up layer__, transforms each character's representation from a one-hot vector into a *character embedding*. In this code I initialize the embedding matrix randomly, and I know it looks too simple. We could add some language knowledge later. For example, we could tokenize the sentence and use pre-trained word-level embeddings, so that every character in a token is initialized with that token's word embedding. In addition, we could build the character embedding by combining low-level features (please see section 4.1 of paper [2] and section 3.3 of paper [3] for more details).

The second layer, the __BiLSTM layer__, can efficiently use *both past and future* input information and extract features automatically.

The third layer, the __CRF layer__, labels the tag for each character in one sentence. If we used a Softmax layer for labelling, we might get ungrammatical tag sequences because Softmax labels each position independently. We know that 'I-LOC' cannot follow 'B-PER', but Softmax does not. Compared to a Softmax layer, a CRF layer can use *sentence-level tag information* and model the transitions between each pair of tags.
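The difference between per-position labelling and sequence-level decoding can be sketched in plain Python. This is an illustration only, not the code in `model.py`: the tag subset, the hand-picked emission scores, and the hard-coded transition matrix (0 for allowed pairs, a large negative value for forbidden ones) are all assumptions. A trained CRF layer learns its transition scores from data rather than having them fixed by rule.

```python
from typing import List

TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]  # illustrative subset of the tag set
FORBIDDEN = -1e4  # assumed stand-in for a strongly negative learned transition score


def allowed(prev: str, cur: str) -> bool:
    """BIO rule: I-X may only follow B-X or I-X."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True


# TRANSITIONS[p][c] is the score of moving from tag p to tag c.
TRANSITIONS = [[0.0 if allowed(p, c) else FORBIDDEN for c in TAGS] for p in TAGS]


def argmax_decode(emissions: List[List[float]]) -> List[str]:
    """Softmax-style decoding: pick the best tag at each position independently."""
    return [TAGS[max(range(len(TAGS)), key=row.__getitem__)] for row in emissions]


def viterbi_decode(emissions: List[List[float]]) -> List[str]:
    """CRF-style decoding: pick the tag sequence with the best total score."""
    n = len(TAGS)
    score = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []
    for row in emissions[1:]:
        new_score, ptr = [], []
        for c in range(n):
            best_prev = max(range(n), key=lambda p: score[p] + TRANSITIONS[p][c])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + TRANSITIONS[best_prev][c] + row[c])
        score = new_score
        backptrs.append(ptr)
    best = max(range(n), key=score.__getitem__)
    path = [best]
    for ptr in reversed(backptrs):  # follow back-pointers from the last position
        best = ptr[best]
        path.append(best)
    return [TAGS[i] for i in reversed(path)]


# Hypothetical per-character scores (rows: characters, columns: TAGS order).
EMISSIONS = [
    [0.1, 2.0, 0.2, 0.3, 0.1],  # clearly B-PER
    [0.2, 0.1, 1.0, 0.1, 1.2],  # I-LOC scores slightly above I-PER
    [1.5, 0.1, 0.3, 0.2, 0.4],  # clearly O
]

independent = argmax_decode(EMISSIONS)  # ['B-PER', 'I-LOC', 'O'], an invalid sequence
sequence = viterbi_decode(EMISSIONS)    # ['B-PER', 'I-PER', 'O']
```

Independent decoding takes the locally best 'I-LOC' at the second character, while Viterbi decoding gives it up for 'I-PER' because the 'B-PER' to 'I-LOC' transition is penalized.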
### dataset

| | #sentence | #PER | #LOC | #ORG |
| :----: | :---: | :---: | :---: | :---: |
| train | 46364 | 17615 | 36517 | 20571 |
| test | 4365 | 1973 | 2877 | 1331 |

The data appears to be a portion of the [MSRA corpus](http://sighan.cs.uchicago.edu/bakeoff2006/).

### train

`python main.py --mode=train`

### test

`python main.py --mode=test --demo_model=1521112368`

Please set the parameter `--demo_model` to the model you want to test. `1521112368` is the model trained by me.

An official evaluation tool: [here (click 'Instructions')](http://sighan.cs.uchicago.edu/bakeoff2006/)

My test performance:

| P | R | F | F (PER) | F (LOC) | F (ORG) |
| :---: | :---: | :---: | :---: | :---: | :---: |
| 0.8945 | 0.8752 | 0.8847 | 0.8688 | 0.9118 | 0.8515 |

### demo

`python main.py --mode=demo --demo_model=1521112368`

You can input one Chinese sentence and the model will return the recognition result:

![demo_pic](./pics/pic2.png)

### references

\[1\] [Bidirectional LSTM-CRF Models for Sequence Tagging](https://arxiv.org/pdf/1508.01991v1.pdf)

\[2\] [Neural Architectures for Named Entity Recognition](http://aclweb.org/anthology/N16-1030)

\[3\] [Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition](http://www.nlpr.ia.ac.cn/cip/ZhangPublications/dong-nlpcc-2016.pdf)

\[4\] [https://github.com/guillaumegenthial/sequence_tagging](https://github.com/guillaumegenthial/sequence_tagging)