# BERT
**\*\*\*\*\* New November 15th, 2018: SOTA SQuAD 2.0 System \*\*\*\*\***
We released code changes to reproduce our 83% F1 SQuAD 2.0 system, which is
currently 1st place on the leaderboard by 3%. See the SQuAD 2.0 section of the
README for details.
**\*\*\*\*\* New November 5th, 2018: Third-party PyTorch and Chainer versions of
BERT available \*\*\*\*\***
NLP researchers from HuggingFace made a
[PyTorch version of BERT available](https://github.com/huggingface/pytorch-pretrained-BERT)
which is compatible with our pre-trained checkpoints and is able to reproduce
our results. Sosuke Kobayashi also made a
[Chainer version of BERT available](https://github.com/soskek/bert-chainer)
(thanks!). We were not involved in the creation or maintenance of the PyTorch
implementation, so please direct any questions to the authors of that
repository.
**\*\*\*\*\* New November 3rd, 2018: Multilingual and Chinese models available
\*\*\*\*\***
We have made two new BERT models available:
* **[`BERT-Base, Multilingual`](https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip)**:
102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
* **[`BERT-Base, Chinese`](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip)**:
Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M
parameters
We use character-based tokenization for Chinese, and WordPiece tokenization for
all other languages. Both models should work out-of-the-box without any code
changes. We did update the implementation of `BasicTokenizer` in
`tokenization.py` to support Chinese character tokenization, so please update if
you forked it. However, we did not change the tokenization API.
For more, see the
[Multilingual README](https://github.com/google-research/bert/blob/master/multilingual.md).
**\*\*\*\*\* End new information \*\*\*\*\***
## Introduction
**BERT**, or **B**idirectional **E**ncoder **R**epresentations from
**T**ransformers, is a new method of pre-training language representations which
obtains state-of-the-art results on a wide array of Natural Language Processing
(NLP) tasks.
Our academic paper which describes BERT in detail and provides full results on a
number of tasks can be found here:
[https://arxiv.org/abs/1810.04805](https://arxiv.org/abs/1810.04805).
To give a few numbers, here are the results on the
[SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/) question answering
task:
SQuAD v1.1 Leaderboard (Oct 8th 2018) | Test EM | Test F1
------------------------------------- | :------: | :------:
1st Place Ensemble - BERT | **87.4** | **93.2**
2nd Place Ensemble - nlnet | 86.0 | 91.7
1st Place Single Model - BERT | **85.1** | **91.8**
2nd Place Single Model - nlnet | 83.5 | 90.1
And several natural language inference tasks:
System | MultiNLI | Question NLI | SWAG
----------------------- | :------: | :----------: | :------:
BERT | **86.7** | **91.1** | **86.3**
OpenAI GPT (Prev. SOTA) | 82.2 | 88.1 | 75.0
Plus many other tasks.
Moreover, these results were all obtained with almost no task-specific neural
network architecture design.
If you already know what BERT is and you just want to get started, you can
[download the pre-trained models](#pre-trained-models) and
[run a state-of-the-art fine-tuning](#fine-tuning-with-bert) in only a few
minutes.
## What is BERT?
BERT is a method of pre-training language representations, meaning that we train
a general-purpose "language understanding" model on a large text corpus (like
Wikipedia), and then use that model for downstream NLP tasks that we care about
(like question answering). BERT outperforms previous methods because it is the
first *unsupervised*, *deeply bidirectional* system for pre-training NLP models.
*Unsupervised* means that BERT was trained using only a plain text corpus, which
is important because an enormous amount of plain text data is publicly available
on the web in many languages.
Pre-trained representations can also either be *context-free* or *contextual*,
and contextual representations can further be *unidirectional* or
*bidirectional*. Context-free models such as
[word2vec](https://www.tensorflow.org/tutorials/representation/word2vec) or
[GloVe](https://nlp.stanford.edu/projects/glove/) generate a single "word
embedding" representation for each word in the vocabulary, so `bank` would have
the same representation in `bank deposit` and `river bank`. Contextual models
instead generate a representation of each word that is based on the other words
in the sentence.
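To make the context-free case concrete, here is a minimal sketch of an embedding lookup table. The vectors are made-up illustrative values rather than real word2vec or GloVe output, and `embed_context_free` is a hypothetical helper, not part of this repository.

```python
# Toy context-free embedding table: one fixed vector per vocabulary word.
# The numbers are invented for illustration only.
EMBEDDINGS = {
    "bank": [0.2, -0.7, 0.5],
    "deposit": [0.9, 0.1, -0.3],
    "river": [-0.4, 0.8, 0.6],
}


def embed_context_free(sentence):
    """Look up each word independently; the surrounding words are ignored."""
    return [EMBEDDINGS[word] for word in sentence.split()]


bank_in_finance = embed_context_free("bank deposit")[0]
bank_in_nature = embed_context_free("river bank")[1]

# Both lookups hit the same table entry, so the two vectors are identical.
same_vector = bank_in_finance == bank_in_nature
```

A contextual model would replace the plain table lookup with a function of the whole sentence, so the two occurrences of `bank` could receive different vectors.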
BERT was built upon recent work in pre-training contextual representations —
including [Semi-supervised Sequence Learning](https://arxiv.org/abs/1511.01432),
[Generative Pre-Training](https://blog.openai.com/language-unsupervised/),
[ELMo](https://allennlp.org/elmo), and
[ULMFit](http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html)
— but crucially these models are all *unidirectional* or *shallowly
bidirectional*. This means that each word is only contextualized using the words
to its left (or right). For example, in the sentence `I made a bank deposit` the
unidirectional representation of `bank` is only based on `I made a` but not
`deposit`. Some previous work does combine the representations from separate
left-context and right-context models, but only in a "shallow" manner. BERT
represents `bank` using both its left and right context (`I made a ... deposit`),
starting from the very bottom of a deep neural network, so it is *deeply
bidirectional*.
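The difference in visible context can be sketched in a few lines. The helpers below are hypothetical and only show which words each kind of model is allowed to look at when representing `bank`; they say nothing about how a network combines those words.

```python
def left_context(tokens, index):
    """Words visible to a left-to-right (unidirectional) model."""
    return tokens[:index]


def bidirectional_context(tokens, index):
    """Words visible on both sides of the target position."""
    return tokens[:index] + tokens[index + 1:]


tokens = "I made a bank deposit".split()
target = tokens.index("bank")

left_only = left_context(tokens, target)            # ['I', 'made', 'a']
both_sides = bidirectional_context(tokens, target)  # ['I', 'made', 'a', 'deposit']
```

Only the bidirectional view includes `deposit`, the word that disambiguates `bank` in this sentence.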
BERT uses a simple approach for this: We mask out 15% of the words in the input,
run the entire sequence through a deep bidirectional
[Transformer](https://arxiv.org/abs/1706.03762) encoder, and then predict only
the masked words. For example:
```
Input: the man went to the [MASK1] . he bought a [MASK2] of milk .
Labels: [MASK1] = store; [MASK2] = gallon
```
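The masking step can be sketched as follows. This is a simplified illustration under stated assumptions, not the repository's data pipeline: `mask_tokens` is a hypothetical helper, it masks whole whitespace-separated tokens, uses a single `[MASK]` symbol, and rounds 15% of the sequence length to choose how many positions to hide. It also leaves out the refinements described in the paper.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide roughly `mask_prob` of the tokens and record the originals.

    Returns the masked sequence and a dict mapping each masked position
    to the token that the model would be asked to predict there.
    """
    rng = random.Random(seed)
    num_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = sorted(rng.sample(range(len(tokens)), num_to_mask))

    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]  # prediction target for this position
        masked[pos] = "[MASK]"     # what the encoder actually sees
    return masked, labels


tokens = "the man went to the store . he bought a gallon of milk .".split()
masked, labels = mask_tokens(tokens)
```

Only the positions stored in `labels` contribute to the prediction objective; every other token passes through unchanged and serves as context.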
In order to learn relationships between sentences, we also train on a simple
task which can be generated from any monolingual corpus: Given two sentences `A`
and `B`, is `B` the actual next sentence that comes after `A`, or just a random
sentence from the corpus?
```
Sentence A: the man went to the store .
Sentence B: he bought a gallon of milk .
Label: IsNextSentence
```
```
Sentence A: the man went to the store .
Sentence B: penguins are flightless .
Label: NotNextSentence
```
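Generating such sentence pairs can be sketched like this. The helper `make_nsp_example` and the four-sentence toy corpus are hypothetical, and the even split between real and random next sentences is an assumption of this sketch rather than something stated in this README.

```python
import random


def make_nsp_example(sentences, index, rng):
    """Build one (sentence_a, sentence_b, label) training example.

    `index` must not point at the last sentence, because the positive
    case needs a real successor.
    """
    sentence_a = sentences[index]
    if rng.random() < 0.5:
        # Positive example: B is the sentence that actually follows A.
        return sentence_a, sentences[index + 1], "IsNextSentence"
    # Negative example: B is any sentence other than A or its true successor.
    candidates = [
        s for j, s in enumerate(sentences) if j not in (index, index + 1)
    ]
    return sentence_a, rng.choice(candidates), "NotNextSentence"


corpus = [
    "the man went to the store .",
    "he bought a gallon of milk .",
    "penguins are flightless .",
    "they live in the southern hemisphere .",
]

rng = random.Random(0)
examples = [make_nsp_example(corpus, 0, rng) for _ in range(20)]
```

Because the labels come from sentence order alone, no manual annotation is needed, which is why the task can be generated from any monolingual corpus.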
We then train a large model (12-layer to 24-layer Transformer) on a large corpus
(Wikipedia + [BookCorpus](http://yknzhu.wixsite.com/mbweb)) for a long time (1M
update steps), and that's BERT.
Using BERT has two stages: *Pre-training* and *fine-tuning*.
**Pre-training** is fairly expensive (four days on 4 to 16 Cloud TPUs), but is a
one-time procedure for each language (the models from the paper are English-only;
the multilingual and Chinese models announced at the top of this README were
released separately). We are releasing a
number of pre-trained models from the paper which were pre-trained at Google.
Most NLP researchers will never need to pre-train their own model from scratch.
**Fine-tuning** is inexpensive. All of the results in the paper can be
replicated in at most 1 hour on a single Cloud TPU, or a few hours on a GPU,
starting from the exact same pre-trained model. For example, a model can be
fine-tuned on SQuAD in around 30 minutes on a single Cloud TPU to achieve a Dev
F1 score of 91.0%, which is the single-system state of the art.
The other important aspect of BERT is that it can be adapted to many types of
NLP tasks very easily. In the paper, we demonstrate state-of-the-art results on
sentence-level (e.g., SST-2), sentence-pair-level (e.g., MultiNLI), word-level
(e.g., NER), and span-level (e.g., SQuAD) tasks with almost no task-specific
modifications.
## What has been released in this repository?
We are releasing the following:
* TensorFlow code for the BERT model architecture (