# BERT
**\*\*\*\*\* New November 15th, 2018: SOTA SQuAD 2.0 System \*\*\*\*\***
We released code changes to reproduce our 83% F1 SQuAD 2.0 system, which is
currently 1st place on the leaderboard by 3%. See the SQuAD 2.0 section of the
README for details.
**\*\*\*\*\* New November 5th, 2018: Third-party PyTorch and Chainer versions of
BERT available \*\*\*\*\***
NLP researchers from HuggingFace made a
[PyTorch version of BERT available](https://github.com/huggingface/pytorch-pretrained-BERT)
which is compatible with our pre-trained checkpoints and is able to reproduce
our results. Sosuke Kobayashi also made a
[Chainer version of BERT available](https://github.com/soskek/bert-chainer)
(Thanks!) We were not involved in the creation or maintenance of the PyTorch
implementation, so please direct any questions towards the authors of that
repository.
**\*\*\*\*\* New November 3rd, 2018: Multilingual and Chinese models available
\*\*\*\*\***
We have made two new BERT models available:
* **[`BERT-Base, Multilingual`](https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip)**:
102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
* **[`BERT-Base, Chinese`](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip)**:
Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M
parameters
We use character-based tokenization for Chinese, and WordPiece tokenization for
all other languages. Both models should work out-of-the-box without any code
changes. We did update the implementation of `BasicTokenizer` in
`tokenization.py` to support Chinese character tokenization, so please update if
you forked it. However, we did not change the tokenization API.
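As a quick sanity check of the tokenizer, here is a minimal sketch (not part of the release itself) that runs `FullTokenizer` from `tokenization.py` on a mixed Chinese/English string; the vocab path and the `do_lower_case` setting are placeholders that should match whichever checkpoint you downloaded.
```
# Minimal sketch: tokenize mixed Chinese/English text with tokenization.py.
# The vocab path and do_lower_case flag are assumptions -- set them to match
# the checkpoint you actually use.
import tokenization  # tokenization.py from this repository

tokenizer = tokenization.FullTokenizer(
    vocab_file="chinese_L-12_H-768_A-12/vocab.txt",
    do_lower_case=True)

# Chinese characters are split one by one; everything else goes through
# WordPiece, so English words may be broken into sub-tokens.
print(tokenizer.tokenize(u"BERT 在 维基百科 上 预训练"))
```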
For more, see the
[Multilingual README](https://github.com/google-research/bert/blob/master/multilingual.md).
**\*\*\*\*\* End new information \*\*\*\*\***
## Introduction
**BERT**, or **B**idirectional **E**ncoder **R**epresentations from
**T**ransformers, is a new method of pre-training language representations which
obtains state-of-the-art results on a wide array of Natural Language Processing
(NLP) tasks.
Our academic paper which describes BERT in detail and provides full results on a
number of tasks can be found here:
[https://arxiv.org/abs/1810.04805](https://arxiv.org/abs/1810.04805).
To give a few numbers, here are the results on the
[SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/) question answering
task:
SQuAD v1.1 Leaderboard (Oct 8th 2018) | Test EM | Test F1
------------------------------------- | :------: | :------:
1st Place Ensemble - BERT | **87.4** | **93.2**
2nd Place Ensemble - nlnet | 86.0 | 91.7
1st Place Single Model - BERT | **85.1** | **91.8**
2nd Place Single Model - nlnet | 83.5 | 90.1
And several natural language inference tasks:
System | MultiNLI | Question NLI | SWAG
----------------------- | :------: | :----------: | :------:
BERT | **86.7** | **91.1** | **86.3**
OpenAI GPT (Prev. SOTA) | 82.2 | 88.1 | 75.0
Plus many other tasks.
Moreover, these results were all obtained with almost no task-specific neural
network architecture design.
If you already know what BERT is and you just want to get started, you can
[download the pre-trained models](#pre-trained-models) and
[run a state-of-the-art fine-tuning](#fine-tuning-with-bert) in only a few
minutes.
## What is BERT?
BERT is a method of pre-training language representations, meaning that we train
a general-purpose "language understanding" model on a large text corpus (like
Wikipedia), and then use that model for downstream NLP tasks that we care about
(like question answering). BERT outperforms previous methods because it is the
first *unsupervised*, *deeply bidirectional* system for pre-training NLP.
*Unsupervised* means that BERT was trained using only a plain text corpus, which
is important because an enormous amount of plain text data is publicly available
on the web in many languages.
Pre-trained representations can also either be *context-free* or *contextual*,
and contextual representations can further be *unidirectional* or
*bidirectional*. Context-free models such as
[word2vec](https://www.tensorflow.org/tutorials/representation/word2vec) or
[GloVe](https://nlp.stanford.edu/projects/glove/) generate a single "word
embedding" representation for each word in the vocabulary, so `bank` would have
the same representation in `bank deposit` and `river bank`. Contextual models
instead generate a representation of each word that is based on the other words
in the sentence.
BERT was built upon recent work in pre-training contextual representations —
including [Semi-supervised Sequence Learning](https://arxiv.org/abs/1511.01432),
[Generative Pre-Training](https://blog.openai.com/language-unsupervised/),
[ELMo](https://allennlp.org/elmo), and
[ULMFit](http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html)
— but crucially these models are all *unidirectional* or *shallowly
bidirectional*. This means that each word is only contextualized using the words
to its left (or right). For example, in the sentence `I made a bank deposit` the
unidirectional representation of `bank` is only based on `I made a` but not
`deposit`. Some previous work does combine the representations from separate
left-context and right-context models, but only in a "shallow" manner. BERT
represents "bank" using both its left and right context — `I made a ... deposit`
— starting from the very bottom of a deep neural network, so it is *deeply
bidirectional*.
BERT uses a simple approach for this: We mask out 15% of the words in the input,
run the entire sequence through a deep bidirectional
[Transformer](https://arxiv.org/abs/1706.03762) encoder, and then predict only
the masked words. For example:
```
Input: the man went to the [MASK1] . he bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon
```
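As a rough illustration of this masking step (and not the exact logic in `create_pretraining_data.py`, which also sometimes keeps the original token or swaps in a random one), a toy sketch could look like this:
```
import random

def mask_tokens(tokens, mask_prob=0.15):
    """Replace roughly 15% of tokens with [MASK]; return the masked tokens
    plus a map from masked positions to the original tokens (the labels)."""
    masked = list(tokens)
    labels = {}
    for i, token in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = token
            masked[i] = "[MASK]"
    return masked, labels

tokens = "the man went to the store . he bought a gallon of milk .".split()
masked, labels = mask_tokens(tokens)
print(masked)   # e.g. [..., '[MASK]', '.', 'he', 'bought', ...]
print(labels)   # {position: original token to predict}
```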
In order to learn relationships between sentences, we also train on a simple
task which can be generated from any monolingual corpus: Given two sentences `A`
and `B`, is `B` the actual next sentence that comes after `A`, or just a random
sentence from the corpus?
```
Sentence A: the man went to the store .
Sentence B: he bought a gallon of milk .
Label: IsNextSentence
```
```
Sentence A: the man went to the store .
Sentence B: penguins are flightless .
Label: NotNextSentence
```
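A toy sketch of how such pairs can be generated from a corpus follows (again, the real pipeline is in `create_pretraining_data.py`; `documents` here is just a hypothetical list of documents, each a list of sentences):
```
import random

def make_nsp_example(documents):
    """documents: list of documents, each a list of at least two sentences."""
    doc = random.choice(documents)
    i = random.randrange(len(doc) - 1)
    sent_a = doc[i]
    if random.random() < 0.5:
        return sent_a, doc[i + 1], "IsNextSentence"
    # Otherwise pair A with a random sentence (which may, rarely, still be
    # the true next sentence in this toy version).
    other = random.choice(random.choice(documents))
    return sent_a, other, "NotNextSentence"

docs = [
    ["the man went to the store .", "he bought a gallon of milk ."],
    ["penguins are flightless .", "they live mostly in the southern hemisphere ."],
]
print(make_nsp_example(docs))
```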
We then train a large model (12-layer to 24-layer Transformer) on a large corpus
(Wikipedia + [BookCorpus](http://yknzhu.wixsite.com/mbweb)) for a long time (1M
update steps), and that's BERT.
Using BERT has two stages: *Pre-training* and *fine-tuning*.
**Pre-training** is fairly expensive (four days on 4 to 16 Cloud TPUs), but it is a
one-time procedure for each language (English, multilingual, and Chinese models
are now available; see the announcements above). We are releasing a number of
pre-trained models from the paper, which were pre-trained at Google. Most NLP
researchers will never need to pre-train their own model from scratch.
**Fine-tuning** is inexpensive. All of the results in the paper can be
replicated in at most 1 hour on a single Cloud TPU, or a few hours on a GPU,
starting from the exact same pre-trained model. SQuAD, for example, can be
trained in around 30 minutes on a single Cloud TPU to achieve a Dev F1 score of
91.0%, which is the single system state-of-the-art.
The other important aspect of BERT is that it can be adapted to many types of
NLP tasks very easily. In the paper, we demonstrate state-of-the-art results on
sentence-level (e.g., SST-2), sentence-pair-level (e.g., MultiNLI), word-level
(e.g., NER), and span-level (e.g., SQuAD) tasks with almost no task-specific
modifications.
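To make this concrete, here is a minimal, hypothetical sketch of a sentence-level classification head on top of the pre-trained model via `modeling.py`; the sequence length, config path, and number of labels are placeholders, and the full fine-tuning logic lives in `run_classifier.py`.
```
import tensorflow as tf   # TensorFlow 1.x, as used by this repository
import modeling           # modeling.py from this repository

# Placeholders for a batch of examples; 128 is an assumed max sequence length.
input_ids = tf.placeholder(tf.int32, shape=[None, 128])
input_mask = tf.placeholder(tf.int32, shape=[None, 128])
segment_ids = tf.placeholder(tf.int32, shape=[None, 128])

bert_config = modeling.BertConfig.from_json_file("bert_config.json")
model = modeling.BertModel(
    config=bert_config,
    is_training=False,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=segment_ids)

# The pooled [CLS] representation; a single dense layer on top is the only
# task-specific piece needed for a sentence-level task.
pooled_output = model.get_pooled_output()
num_labels = 2  # task-dependent, e.g. 3 for MultiNLI
logits = tf.layers.dense(pooled_output, num_labels)
```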
## What has been released in this repository?
We are releasing the following:
* TensorFlow code for the BERT model architecture (which is mostly a standard
  Transformer architecture).