# BERT
**\*\*\*\*\* New November 15th, 2018: SOTA SQuAD 2.0 System \*\*\*\*\***
We released code changes to reproduce our 83% F1 SQuAD 2.0 system, which is
currently 1st place on the leaderboard by 3%. See the SQuAD 2.0 section of the
README for details.
**\*\*\*\*\* New November 5th, 2018: Third-party PyTorch and Chainer versions of
BERT available \*\*\*\*\***
NLP researchers from HuggingFace made a
[PyTorch version of BERT available](https://github.com/huggingface/pytorch-pretrained-BERT)
which is compatible with our pre-trained checkpoints and is able to reproduce
our results. Sosuke Kobayashi also made a
[Chainer version of BERT available](https://github.com/soskek/bert-chainer)
(Thanks!) We were not involved in the creation or maintenance of the PyTorch or
Chainer implementations, so please direct any questions to the authors of those
repositories.
**\*\*\*\*\* New November 3rd, 2018: Multilingual and Chinese models available
\*\*\*\*\***
We have made two new BERT models available:
* **[`BERT-Base, Multilingual`](https://storage.googleapis.com/bert_models/2018_11_03/multilingual_L-12_H-768_A-12.zip)**:
102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
* **[`BERT-Base, Chinese`](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip)**:
Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M
parameters
We use character-based tokenization for Chinese, and WordPiece tokenization for
all other languages. Both models should work out-of-the-box without any code
changes. We did update the implementation of `BasicTokenizer` in
`tokenization.py` to support Chinese character tokenization, so please update if
you forked it. However, we did not change the tokenization API.
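As a quick reference, here is a minimal, hedged sketch of how the tokenizer in
`tokenization.py` can be invoked; the vocab path is a placeholder, and
`do_lower_case` should match whichever checkpoint you downloaded:

```python
# Minimal sketch, assuming this repository's tokenization.py is importable and
# vocab.txt has been extracted from one of the checkpoints above.
import tokenization

tokenizer = tokenization.FullTokenizer(
    vocab_file="chinese_L-12_H-768_A-12/vocab.txt",  # placeholder path
    do_lower_case=True)  # set to match the checkpoint you use

tokens = tokenizer.tokenize(u"BERT 在维基百科上预训练。")
ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)  # Chinese is split per character; other text uses WordPiece
print(ids)
```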
For more, see the
[Multilingual README](https://github.com/google-research/bert/blob/master/multilingual.md).
**\*\*\*\*\* End new information \*\*\*\*\***
## Introduction
**BERT**, or **B**idirectional **E**ncoder **R**epresentations from
**T**ransformers, is a new method of pre-training language representations which
obtains state-of-the-art results on a wide array of Natural Language Processing
(NLP) tasks.
Our academic paper which describes BERT in detail and provides full results on a
number of tasks can be found here:
[https://arxiv.org/abs/1810.04805](https://arxiv.org/abs/1810.04805).
To give a few numbers, here are the results on the
[SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/) question answering
task:
SQuAD v1.1 Leaderboard (Oct 8th 2018) | Test EM | Test F1
------------------------------------- | :------: | :------:
1st Place Ensemble - BERT | **87.4** | **93.2**
2nd Place Ensemble - nlnet | 86.0 | 91.7
1st Place Single Model - BERT | **85.1** | **91.8**
2nd Place Single Model - nlnet | 83.5 | 90.1
And several natural language inference tasks:
System | MultiNLI | Question NLI | SWAG
----------------------- | :------: | :----------: | :------:
BERT | **86.7** | **91.1** | **86.3**
OpenAI GPT (Prev. SOTA) | 82.2 | 88.1 | 75.0
Plus many other tasks.
Moreover, these results were all obtained with almost no task-specific neural
network architecture design.
If you already know what BERT is and you just want to get started, you can
[download the pre-trained models](#pre-trained-models) and
[run a state-of-the-art fine-tuning](#fine-tuning-with-bert) in only a few
minutes.
## What is BERT?
BERT is a method of pre-training language representations, meaning that we train
a general-purpose "language understanding" model on a large text corpus (like
Wikipedia), and then use that model for downstream NLP tasks that we care about
(like question answering). BERT outperforms previous methods because it is the
first *unsupervised*, *deeply bidirectional* system for pre-training NLP.
*Unsupervised* means that BERT was trained using only a plain text corpus, which
is important because an enormous amount of plain text data is publicly available
on the web in many languages.
Pre-trained representations can also either be *context-free* or *contextual*,
and contextual representations can further be *unidirectional* or
*bidirectional*. Context-free models such as
[word2vec](https://www.tensorflow.org/tutorials/representation/word2vec) or
[GloVe](https://nlp.stanford.edu/projects/glove/) generate a single "word
embedding" representation for each word in the vocabulary, so `bank` would have
the same representation in `bank deposit` and `river bank`. Contextual models
instead generate a representation of each word that is based on the other words
in the sentence.
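As a toy illustration (not code from this repository), a context-free model is
essentially a fixed lookup table, so `bank` maps to the same vector no matter
which sentence it appears in:

```python
# Toy context-free embedding table: one fixed vector per vocabulary word.
import numpy as np

rng = np.random.RandomState(0)
vocab = ["i", "made", "a", "bank", "deposit", "river"]
embedding_table = {word: rng.randn(4) for word in vocab}

sentence_1 = ["i", "made", "a", "bank", "deposit"]
sentence_2 = ["river", "bank"]

# The lookup ignores context, so "bank" gets an identical vector in both uses.
vec_1 = embedding_table["bank"]
vec_2 = embedding_table["bank"]
print(np.array_equal(vec_1, vec_2))  # True: the two senses of "bank" collapse
```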
BERT was built upon recent work in pre-training contextual representations —
including [Semi-supervised Sequence Learning](https://arxiv.org/abs/1511.01432),
[Generative Pre-Training](https://blog.openai.com/language-unsupervised/),
[ELMo](https://allennlp.org/elmo), and
[ULMFit](http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html)
— but crucially these models are all *unidirectional* or *shallowly
bidirectional*. This means that each word is only contextualized using the words
to its left (or right). For example, in the sentence `I made a bank deposit` the
unidirectional representation of `bank` is only based on `I made a` but not
`deposit`. Some previous work does combine the representations from separate
left-context and right-context models, but only in a "shallow" manner. BERT
represents "bank" using both its left and right context — `I made a ... deposit`
— starting from the very bottom of a deep neural network, so it is *deeply
bidirectional*.
BERT uses a simple approach for this: We mask out 15% of the words in the input,
run the entire sequence through a deep bidirectional
[Transformer](https://arxiv.org/abs/1706.03762) encoder, and then predict only
the masked words. For example:
```
Input: the man went to the [MASK1] . he bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon
```
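The following is a simplified sketch of that masking step (the repository's
actual data-generation code differs in detail, e.g. it sometimes keeps the
original word or substitutes a random one):

```python
# Simplified masked-LM data generation: hide ~15% of tokens and keep the
# original words as prediction targets.
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]"):
    tokens = list(tokens)
    labels = {}  # position -> original word the model must predict
    for i in range(len(tokens)):
        if random.random() < mask_prob:
            labels[i] = tokens[i]
            tokens[i] = mask_token
    return tokens, labels

random.seed(0)
sentence = "the man went to the store . he bought a gallon of milk .".split()
masked, labels = mask_tokens(sentence)
print(" ".join(masked))
print(labels)
```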
In order to learn relationships between sentences, we also train on a simple
task which can be generated from any monolingual corpus: Given two sentences `A`
and `B`, is `B` the actual next sentence that comes after `A`, or just a random
sentence from the corpus?
```
Sentence A: the man went to the store .
Sentence B: he bought a gallon of milk .
Label: IsNextSentence
```
```
Sentence A: the man went to the store .
Sentence B: penguins are flightless .
Label: NotNextSentence
```
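A simplified sketch of generating such pairs from sentences in document order
(a real pipeline samples the negative sentence from a different document; here
we just avoid reusing sentence `A` or the true next sentence):

```python
# Simplified next-sentence-prediction example generation.
import random

def make_nsp_example(sentences, i):
    sentence_a = sentences[i]
    if i + 1 < len(sentences) and random.random() < 0.5:
        return sentence_a, sentences[i + 1], "IsNextSentence"
    # Negative case: pick any sentence other than A or its true successor.
    candidates = [j for j in range(len(sentences)) if j not in (i, i + 1)]
    return sentence_a, sentences[random.choice(candidates)], "NotNextSentence"

random.seed(0)
corpus = ["the man went to the store .",
          "he bought a gallon of milk .",
          "penguins are flightless ."]
print(make_nsp_example(corpus, 0))
```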
We then train a large model (12-layer to 24-layer Transformer) on a large corpus
(Wikipedia + [BookCorpus](http://yknzhu.wixsite.com/mbweb)) for a long time (1M
update steps), and that's BERT.
Using BERT has two stages: *Pre-training* and *fine-tuning*.
**Pre-training** is fairly expensive (four days on 4 to 16 Cloud TPUs), but is a
one-time procedure for each language (in addition to the original English-only
models, the multilingual and Chinese models announced above are now available).
We are releasing a
number of pre-trained models from the paper which were pre-trained at Google.
Most NLP researchers will never need to pre-train their own model from scratch.
**Fine-tuning** is inexpensive. All of the results in the paper can be
replicated in at most 1 hour on a single Cloud TPU, or a few hours on a GPU,
starting from the exact same pre-trained model. SQuAD, for example, can be
trained in around 30 minutes on a single Cloud TPU to achieve a Dev F1 score of
91.0%, which is the single system state-of-the-art.
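For example, SQuAD fine-tuning is launched with the repository's
`run_squad.py`; below is a hedged sketch of one way to invoke it from Python
(checkpoint and data paths are placeholders, and the hyperparameter values
shown are illustrative rather than prescribed):

```python
# Sketch of launching SQuAD fine-tuning via run_squad.py (paths are placeholders).
import subprocess

BERT_DIR = "/path/to/uncased_L-12_H-768_A-12"   # downloaded checkpoint directory
SQUAD_DIR = "/path/to/squad"                    # directory with SQuAD JSON files

subprocess.run([
    "python", "run_squad.py",
    f"--vocab_file={BERT_DIR}/vocab.txt",
    f"--bert_config_file={BERT_DIR}/bert_config.json",
    f"--init_checkpoint={BERT_DIR}/bert_model.ckpt",
    "--do_train=True",
    f"--train_file={SQUAD_DIR}/train-v1.1.json",
    "--do_predict=True",
    f"--predict_file={SQUAD_DIR}/dev-v1.1.json",
    "--train_batch_size=12",       # illustrative hyperparameters
    "--learning_rate=3e-5",
    "--num_train_epochs=2.0",
    "--max_seq_length=384",
    "--doc_stride=128",
    "--output_dir=/tmp/squad_output/",
], check=True)
```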
The other important aspect of BERT is that it can be adapted to many types of
NLP tasks very easily. In the paper, we demonstrate state-of-the-art results on
sentence-level (e.g., SST-2), sentence-pair-level (e.g., MultiNLI), word-level
(e.g., NER), and span-level (e.g., SQuAD) tasks with almost no task-specific
modifications.
## What has been released in this repository?
We are releasing the following:
* TensorFlow code for the BERT model architecture (which is mostly a standard
  Transformer architecture).
* Pre-trained checkpoints for the models described in this README.
* TensorFlow code for fine-tuning BERT on downstream tasks such as SQuAD.