<!---
Copyright 2021 The LegacyAI Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p align="center">
<br>
<img src="src/logo2.png" width="400"/>
<br>
</p>
<p align="center">
<a href="https://github.com/legacyai/tf-transformers/actions?workflow=Tests">
<img alt="Tests" src="https://github.com/legacyai/tf-transformers/workflows/Tests/badge.svg">
</a>
<a href="https://codecov.io/gh/legacyai/tf-transformers">
<img alt="Coverage" src="https://codecov.io/gh/legacyai/tf-transformers/branch/main/graph/badge.svg?token=9TZ10G9GL6">
</a>
<a href="https://opensource.org/licenses/Apache-2.0">
<img alt="License" src="https://img.shields.io/badge/License-Apache%202.0-blue.svg">
</a>
</p>
<h3 align="center">
<p>tf-transformers: faster and easier state-of-the-art NLP in TensorFlow 2.0</p>
</h3>
tf-transformers is designed to harness the full power of TensorFlow 2, making it much faster and simpler than existing TensorFlow based NLP architectures. On average, there is an **80%** improvement over existing TensorFlow based libraries on text generation and other tasks. You can find more details in the Benchmarks section.
All or most NLP downstream tasks can be integrated into Transformer based models with ease. All models can be trained using ```model.fit```, which supports **GPU**, **multi-GPU**, and **TPU**.
## Version
version: v1.0.8
## Unique Features
- **Faster auto-regressive decoding** using TensorFlow 2. Faster than PyTorch in most experiments (V100 GPU) and **80%** faster than existing TF based libraries (relative difference). Refer to the [benchmark code](tests/notebooks/benchmarks/).
- Complete **TFlite** support for **BERT, RoBERTa, T5, Albert, mt5** for all downstream tasks except text generation
- **Faster sentence-piece alignment** (no more LCS overhead)
- **Variable batch text generation** for encoder-only models like GPT2
- No more hassle of writing long code for **TFRecords**: minimal and simple
- Off-the-shelf support for auto-batching **tf.data.Dataset** or **tf.RaggedTensor** inputs
- Pass dictionary outputs directly to loss functions inside ```tf.keras.Model.fit``` using **model.compile2** (see the sketch after this list). Refer to the [examples](src/tf_transformers/notebooks/tutorials/) or the [blog](https://legacyai-org.medium.com/tf-transformers-f7722536ba61)
- Multiple mask modes like **causal**, **user-defined**, and **prefix** by changing one argument. Refer to the [examples](src/tf_transformers/notebooks/tutorials/) or the [blog](https://legacyai-org.medium.com/tf-transformers-f7722536ba61)
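As a rough illustration of the dictionary-output flow, here is a minimal sketch of a loss function that consumes the model's full output dict. The key names and the loss signature are assumptions made for this example; the exact ```model.compile2``` arguments are covered in the linked examples and blog.

```python
import tensorflow as tf

# Hypothetical loss for the dictionary-output flow: it receives the label batch
# and the model's full output dict, so auxiliary outputs stay accessible.
# The key names ("labels", "token_logits") are illustrative only.
def loss_fn(batch_labels, model_outputs):
    logits = model_outputs["token_logits"]
    labels = batch_labels["labels"]
    per_token = tf.keras.losses.sparse_categorical_crossentropy(
        labels, logits, from_logits=True)
    return tf.reduce_mean(per_token)

# Self-contained check with dummy tensors (batch of 2, length 4, vocab of 10).
labels = {"labels": tf.constant([[1, 2, 3, 4], [5, 6, 7, 8]])}
outputs = {"token_logits": tf.random.uniform((2, 4, 10))}
print(loss_fn(labels, outputs))
```

Inside ```tf.keras.Model.fit```, ```model.compile2``` forwards the model's dictionary outputs to a loss function of this shape, so no extra output-unpacking code is needed.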
## Performance Benchmarks
Evaluating performance benchmarks is tricky. I evaluated **tf-transformers** primarily on **text-generation** tasks with **GPT2 small** and **T5 small**, against the amazing **HuggingFace**, as it is the go-to library for NLP right now. Text generation tasks require efficient caching to make use of past **Key** and **Value** pairs.
On average, **tf-transformers** is **80%** faster than the **HuggingFace** **TensorFlow** implementation and in most cases it is **comparable** or **faster** than **PyTorch**.
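To see why caching past **Key** and **Value** pairs matters, here is a conceptual greedy-decoding sketch. The ```decoder_step``` interface is an assumption standing in for a serialized decoder, not the tf-transformers API; the dummy step only makes the example runnable.

```python
import tensorflow as tf

def greedy_generate(decoder_step, prompt_ids, max_new_tokens):
    # Encode the full prompt once; afterwards feed only the newest token and
    # reuse the cached keys/values, so each step avoids re-encoding the prefix.
    logits, cache = decoder_step(prompt_ids, cache=None)
    next_id = tf.argmax(logits[:, -1, :], axis=-1, output_type=tf.int32)[:, None]
    generated = [next_id]
    for _ in range(max_new_tokens - 1):
        logits, cache = decoder_step(next_id, cache=cache)
        next_id = tf.argmax(logits[:, -1, :], axis=-1, output_type=tf.int32)[:, None]
        generated.append(next_id)
    return tf.concat([prompt_ids] + generated, axis=1)

# Dummy stand-in so the sketch runs: ignores the cache and returns random
# logits over a 100-token vocabulary.
def dummy_step(token_ids, cache=None):
    shape = tf.concat([tf.shape(token_ids), [100]], axis=0)  # (batch, length, vocab)
    return tf.random.uniform(shape), cache

print(greedy_generate(dummy_step, tf.constant([[1, 2, 3]]), max_new_tokens=5))
```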
#### 1. GPT2 benchmark
The evaluation is based on the average of 5 runs, with different **batch_size**, **beams**, **sequence_length**, etc. So there is quite a large combination of settings when it comes to **beam** and **top-k top-p** decoding. The figures show **10 randomly taken samples**, but you can see the full code and figures in the repo.
* GPT2 greedy
<p align="left">
<br>
<img src="tests/notebooks/benchmarks/gpt2/gpt2_greedy_sample.png" width="900"/>
<br>
</p>
* GPT2 beam
<p align="left">
<br>
<img src="tests/notebooks/benchmarks/gpt2/gpt2_beam_sample.png" width="900"/>
<br>
</p>
* GPT2 top-k top-p
<p align="left">
<br>
<img src="tests/notebooks/benchmarks/gpt2/gpt2_topk_top_p_sample.png" width="900"/>
<br>
</p>
* GPT2 greedy histogram
<p align="left">
<br>
<img src="tests/notebooks/benchmarks/gpt2/greedy.png" width="500"/>
<br>
</p>
* [Codes to reproduce GPT2 benchmark experiments](tests/notebooks/benchmarks/gpt2)
* [Codes to reproduce T5 benchmark experiments](tests/notebooks/benchmarks/t5)
## QuickStart
I am providing some basic tutorials here, which cover the basics of tf-transformers and how we can use it for downstream tasks. All or most tutorials have the following structure:
* Introduction About the Problem
* Prepare Training Data
* Load Model and associated downstream Tasks
* Define Optimizer, Loss
* Train using Keras and CustomTrainer
* Evaluate Using Dev data
* In Production - this section shows how we can use ```tf.saved_model``` in production + pipelines (a minimal sketch follows this list)
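For the production step, here is a minimal sketch of the ```tf.saved_model``` round trip, using a toy Keras model as a stand-in for a trained tf-transformers model; the export path and inputs are illustrative.

```python
import tensorflow as tf

# Toy stand-in for a trained tf-transformers model (any built Keras model works).
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])
tf.saved_model.save(model, "/tmp/toy_saved_model")

# At serving time, load the artifact and call its default serving signature.
loaded = tf.saved_model.load("/tmp/toy_saved_model")
serve_fn = loaded.signatures["serving_default"]
print(serve_fn(tf.ones((1, 4), dtype=tf.float32)))   # dict of output tensors
```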
## Production Ready Tutorials
Start by converting **HuggingFace** models (base models only) to **tf-transformers** models.
Here are a few example Jupyter Notebooks:
- [Basics of tf-transformers](src/tf_transformers/notebooks/tutorials/basics.ipynb)
- [Convert HuggingFace Models ( BERT, Albert, Roberta, GPT2, t5, mt5) to tf-transformers checkpoints](src/tf_transformers/notebooks/conversion_scripts/)
- [Named Entity Recognition + Albert + TFlite + Joint Loss + Pipeline](src/tf_transformers/notebooks/tutorials/ner_albert.ipynb)
- [Squad v1.1 + Roberta + TFlite + Pipeline](src/tf_transformers/notebooks/tutorials/squad_roberta.ipynb)
- [Roberta2Roberta Encoder Decoder + XSUM + Summarisation](src/tf_transformers/notebooks/tutorials/seq2seq_summarization.ipynb)
- [Squad v1.1 + T5 + Text Generation](src/tf_transformers/notebooks/tutorials/t5_squad_as_generation.ipynb)
- [Squad v1.1 + T5 + Span Selection + TFlite + Pipeline](src/tf_transformers/notebooks/tutorials/t5_squad_span_selection.ipynb)
- [Albert + GLUE + Joint Loss - Glue Score 81.0 on 14 M parameter + 5 layers](src/tf_transformers/notebooks/tutorials/joint_loss_experiments)
- [Albert + Squad + Joint Loss - EM/F1 78.1/87.0 on 14 M parameter + 5 layers](src/tf_transformers/notebooks/tutorials/joint_loss_experiments/squad.ipynb)
- Squad v1.1 + GPT2 + Causal Masking EM/F1 37.36/50.20 (Coming Soon)
- Squad v1.1 + GPT2 + Prefix Masking EM/F1 47.52/63.20 (Coming Soon)
- BERT + STS-B + Regression (Coming Soon)
- [BERT + COLA + Text Classification + TFlite + Pipeline](src/tf_transformers/notebooks/tutorials/)
## Why should I use tf-transformers?
1. Use state-of-the-art models in Production, with less than 10 lines of code.
- High performance models, better than all official TensorFlow based models
- Very simple classes for all downstream tasks
- Complete TFlite support for all tasks except text-generation
2. Make industry level experience available to students and the community through clear tutorials
3. Train any model on **GPU**, **multi-GPU**, or **TPU** with the amazing ```tf.keras.Model.fit``` (see the sketch after this list)
- Train state-of-the-art models in a few lines of code.
- All models are completely serializable.
4. Customize any models or pipelines with minimal or no code change.
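As referenced in point 3 above, here is a minimal multi-GPU training sketch using the stock Keras fit loop; the tiny model and dummy data are placeholders for a real tf-transformers model and dataset, and on TPU you would swap in ```tf.distribute.TPUStrategy```.

```python
import tensorflow as tf

# Build and compile the model under a distribution strategy scope so its
# variables are mirrored across the available GPUs (falls back to CPU).
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(2)])   # placeholder model
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Dummy dataset standing in for a TFRecord / tf.data pipeline.
features = tf.random.uniform((32, 4))
labels = tf.random.uniform((32,), maxval=2, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(8)

model.fit(dataset, epochs=1)
```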
## Do we really need to distill? Joint Loss is all we need.
#### 1. GLUE
We have conducted a few experiments to squeeze the most out of **Albert base** models (the concept is applicable to any model, and in tf-transformers it works out of the box).
The idea is to minimize the loss for the specified task at each layer of your model and check the predictions at each layer. As per our experiments, we are able to get the best smaller model (thanks to **Albert**), and from **layer 4** onwards we beat all the smaller models on the **GLUE** benchmark. By **layer 6**, we got a **GLUE** score of **81.0**, which is **4** points ahead of comparable distilled models. A sketch of the joint-loss idea is shown below.
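The per-layer logits below are dummy tensors standing in for the model's dictionary outputs; this is only a sketch of the joint-loss idea, not the library's implementation.

```python
import tensorflow as tf

# Joint loss: average a task loss computed from every layer's predictions,
# so shallow layers are also trained to solve the task.
def joint_loss(labels, per_layer_logits):
    layer_losses = [
        tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True))
        for logits in per_layer_logits
    ]
    return tf.add_n(layer_losses) / len(layer_losses)

labels = tf.constant([1, 0, 2])                                    # 3 examples
per_layer_logits = [tf.random.uniform((3, 3)) for _ in range(6)]   # 6 layers, 3 classes
print(joint_loss(labels, per_layer_logits))
```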