AAAI 2023 Visually Grounded Commonsense Knowledge Acquisition source code: knowledge graph link prediction resources (CSDN Library)

221 files in total

py: 126

md: 17

rst: 16

Commonsense knowledge

Knowledge graph

Computer vision

Points required: 10 · 168 views · Uploaded 2022-12-01 15:26:26 · 42.1 MB ZIP

AAAI 2023 Visually Grounded Commonsense Knowledge Acquisition source code (221 sub-files)

.coveragerc 215B

huggingface.css 4KB

code-snippets.css 257B

Dockerfile 194B

.gitignore 1KB

.gitignore 43B

.gitignore 6B

.gitmodules 205B

MANIFEST.in 16B

cocoEvalCapDemo.ipynb 204KB

Comparing-TF-and-PT-models-SQuAD.ipynb 203KB

Comparing-TF-and-PT-models-MLM-NSP.ipynb 169KB

Comparing-PT-and-TF-models.ipynb 90KB

Comparing-TF-and-PT-models.ipynb 61KB

spice-1.0.jar 18.84MB

meteor-1.5.jar 6.03MB

stanford-corenlp-3.4.1.jar 5.65MB

framework.jpg 423KB

custom.js 2KB

captions_val2014.json 28.33MB

caption_flickr30k.json 24.81MB

captions_val2014_fakecap_results.json 82KB

dev-v2.0-small.json 9KB

LICENSE 11KB

LICENSE 1KB

Makefile 585B

VinVL_MODEL_ZOO.md 22KB

README.md 19KB

MODEL_ZOO.md 15KB

README.md 6KB

VinVL_DOWNLOAD.md 5KB

quickstart.md 5KB

README.md 5KB

migration.md 4KB

README.md 3KB

SECURITY.md 3KB

README.md 2KB

DOWNLOAD.md 2KB

README.md 2KB

INSTALL.md 904B

INSTALL.md 713B

CODE_OF_CONDUCT.md 444B

DATASET.md 180B

test_sentencepiece.model 247KB

Calibre-Regular.otf 49KB

Calibre-Medium.otf 47KB

Calibre-Thin.otf 46KB

oscar.PNG 332KB

pretrain_corpus.PNG 171KB

oscar_logo.png 118KB

warmup_cosine_hard_restarts_schedule.png 22KB

warmup_cosine_warm_restarts_schedule.png 22KB

warmup_cosine_schedule.png 17KB

warmup_linear_schedule.png 16KB

warmup_constant_schedule.png 10KB

modeling_bert.py 66KB

modeling_xlnet.py 62KB

run_vqa.py 61KB

modeling_transfo_xl.py 57KB

run_gqa.py 55KB

modeling_bert.py 49KB

run_nlvr.py 48KB

run_captioning.py 45KB

modeling_xlm.py 44KB

modeling_utils.py 43KB

modeling_utils.py 42KB

utils_squad.py 41KB

cbs.py 39KB

oscar_tsv.py 36KB

modeling_gpt2.py 36KB

run_bag.py 35KB

run_instance_pred_cls.py 35KB

modeling_openai.py 34KB

run_retrieval.py 34KB

modeling_clever.py 31KB

finetune_dataset.py 30KB

run_squad.py 28KB

simple_lm_finetuning.py 27KB

modeling_common_test.py 25KB

run_glue.py 25KB

task_utils.py 24KB

run_swag.py 24KB

run_oscarplus_pretrain.py 24KB

tokenization_transfo_xl.py 21KB

utils_glue.py 21KB

tokenization_utils.py 20KB

tokenization_bert.py 19KB

run_bertology.py 18KB

bert_hubconf.py 17KB

pregenerate_training_data.py 16KB

finetune_on_pregenerated.py 15KB

coco.py 15KB

vanilla_ft.py 15KB

modeling_bert_test.py 14KB

run_openai_gpt.py 14KB

modeling_xlnet_test.py 14KB

prompt_ft.py 14KB

# 👾 PyTorch-Transformers

[![CircleCI](https://circleci.com/gh/huggingface/pytorch-transformers.svg?style=svg)](https://circleci.com/gh/huggingface/pytorch-transformers)

PyTorch-Transformers (formerly known as `pytorch-pretrained-bert`) is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP).

The library currently contains PyTorch implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:

1. **[BERT](https://github.com/google-research/bert)** (from Google) released with the paper [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
2. **[GPT](https://github.com/openai/finetune-transformer-lm)** (from OpenAI) released with the paper [Improving Language Understanding by Generative Pre-Training](https://blog.openai.com/language-unsupervised/) by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
3. **[GPT-2](https://blog.openai.com/better-language-models/)** (from OpenAI) released with the paper [Language Models are Unsupervised Multitask Learners](https://blog.openai.com/better-language-models/) by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
4. **[Transformer-XL](https://github.com/kimiyoung/transformer-xl)** (from Google/CMU) released with the paper [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860) by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
5. **[XLNet](https://github.com/zihangdai/xlnet/)** (from Google/CMU) released with the paper [XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237) by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
6. **[XLM](https://github.com/facebookresearch/XLM/)** (from Facebook) released together with the paper [Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291) by Guillaume Lample and Alexis Conneau.

These implementations have been tested on several datasets (see the example scripts) and should match the performance of the original implementations (e.g. ~93 F1 on SQuAD for BERT Whole-Word-Masking, ~88 F1 on RocStories for OpenAI GPT, ~18.3 perplexity on WikiText 103 for Transformer-XL, ~0.916 Pearson R coefficient on STS-B for XLNet). You can find more details on performance in the Examples section of the [documentation](https://huggingface.co/pytorch-transformers/examples.html).

| Section | Description |
|-|-|
| [Installation](#installation) | How to install the package |
| [Quick tour: Usage](#quick-tour-usage) | Tokenizers & models usage: Bert and GPT-2 |
| [Quick tour: Fine-tuning/usage scripts](#quick-tour-of-the-fine-tuningusage-scripts) | Using provided scripts: GLUE, SQuAD and Text generation |
| [Migrating from pytorch-pretrained-bert to pytorch-transformers](#Migrating-from-pytorch-pretrained-bert-to-pytorch-transformers) | Migrating your code from pytorch-pretrained-bert to pytorch-transformers |
| [Documentation](https://huggingface.co/pytorch-transformers/) | Full API documentation and more |

## Installation

This repo is tested on Python 2.7 and 3.5+ (examples are tested only on Python 3.5+) and PyTorch 0.4.1 to 1.1.0.

### With pip

PyTorch-Transformers can be installed by pip as follows:

```bash
pip install pytorch-transformers
```

### From source

Clone the repository and run:

```bash
pip install [--editable] .
```

### Tests

A series of tests is included for the library and the example scripts.
Library tests can be found in the [tests folder](https://github.com/huggingface/pytorch-transformers/tree/master/pytorch_transformers/tests) and examples tests in the [examples folder](https://github.com/huggingface/pytorch-transformers/tree/master/examples).

These tests can be run using `pytest` (install pytest if needed with `pip install pytest`).

You can run the tests from the root of the cloned repository with the commands:

```bash
python -m pytest -sv ./pytorch_transformers/tests/
python -m pytest -sv ./examples/
```

## Quick tour

Let's do a very quick overview of PyTorch-Transformers. Detailed examples for each model architecture (Bert, GPT, GPT-2, Transformer-XL, XLNet and XLM) can be found in the [full documentation](https://huggingface.co/pytorch-transformers/).

```python
import torch
from pytorch_transformers import *

# PyTorch-Transformers has a unified API
# for 6 transformer architectures and 27 pretrained weights.
#          Model          | Tokenizer          | Pretrained weights shortcut
MODELS = [(BertModel,       BertTokenizer,      'bert-base-uncased'),
          (OpenAIGPTModel,  OpenAIGPTTokenizer, 'openai-gpt'),
          (GPT2Model,       GPT2Tokenizer,      'gpt2'),
          (TransfoXLModel,  TransfoXLTokenizer, 'transfo-xl-wt103'),
          (XLNetModel,      XLNetTokenizer,     'xlnet-base-cased'),
          (XLMModel,        XLMTokenizer,       'xlm-mlm-enfr-1024')]

# Let's encode some text in a sequence of hidden-states using each model:
for model_class, tokenizer_class, pretrained_weights in MODELS:
    # Load pretrained model/tokenizer
    tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
    model = model_class.from_pretrained(pretrained_weights)

    # Encode text
    input_ids = torch.tensor([tokenizer.encode("Here is some text to encode")])
    with torch.no_grad():
        last_hidden_states = model(input_ids)[0]  # Model outputs are now tuples

# Each architecture is provided with several classes for fine-tuning on down-stream tasks, e.g.
BERT_MODEL_CLASSES = [BertModel, BertForPreTraining, BertForMaskedLM, BertForNextSentencePrediction,
                      BertForSequenceClassification, BertForMultipleChoice, BertForTokenClassification,
                      BertForQuestionAnswering]

# All the classes for an architecture can be instantiated from pretrained weights for this architecture
# Note that additional weights added for fine-tuning are only initialized
# and need to be trained on the down-stream task
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
for model_class in BERT_MODEL_CLASSES:
    # Load pretrained model/tokenizer
    model = model_class.from_pretrained('bert-base-uncased')

# Models can return the full list of hidden-states & attention weights at each layer.
# model_class is the last BERT class from the loop above, so load BERT weights here
# (the loop variable pretrained_weights still points at the XLM checkpoint).
model = model_class.from_pretrained('bert-base-uncased',
                                    output_hidden_states=True,
                                    output_attentions=True)
input_ids = torch.tensor([tokenizer.encode("Let's see all hidden-states and attentions on this text")])
all_hidden_states, all_attentions = model(input_ids)[-2:]

# Models are compatible with Torchscript
model = model_class.from_pretrained('bert-base-uncased', torchscript=True)
traced_model = torch.jit.trace(model, (input_ids,))

# Simple serialization for models and tokenizers
model.save_pretrained('./directory/to/save/')  # save
model = model_class.from_pretrained('./directory/to/save/')  # re-load
tokenizer.save_pretrained('./directory/to/save/')  # save
tokenizer = BertTokenizer.from_pretrained('./directory/to/save/')  # re-load

# SOTA examples for GLUE, SQUAD, text generation...
```

## Quick tour of the fine-tuning/usage scripts

The library comprises several example scripts with state-of-the-art performance for NLU and NLG tasks:

- `run_glue.py`: an example fine-tuning Bert, XLNet and XLM on nine different GLUE tasks (*sequence-level classification*)
- `run_squad.py`: an example fine-tuning Bert, XLNet and XLM on the question answering dataset SQuAD 2.0 (*token-level classification*)
- `run_generation.py`: an example using GPT, GPT-2, Transformer-XL and XLNet for conditional language generation
- other model-specific examples (see the documentation).

Here are three quick usage examples for these s
