# QLoRA: Efficient Finetuning of Quantized LLMs
## Demo
Guanaco is a system purely intended for research purposes and could produce problematic outputs.
1. Access the [live demo here](https://huggingface.co/spaces/uwnlp/guanaco-playground-tgi). Note this is the 33B model, the 65B model demo will come later.
2. Or host your own Guanaco gradio demo directly in Colab with [this notebook](https://colab.research.google.com/drive/17XEqL1JcmVWjHkT-WczdYkJlNINacwG7?usp=sharing). Works with free GPUs for 7B and 13B models.
3. Alternatively, can you distinguish ChatGPT from Guanaco? Give it a try!
You can access [the model response Colab here](https://colab.research.google.com/drive/1kK6xasHiav9nhiRUJjPMZb4fAED4qRHb?usp=sharing) comparing ChatGPT and Guanaco 65B on Vicuna prompts.
## Installation
To load models in 4-bit precision with transformers and bitsandbytes, you need to install accelerate and transformers from source and make sure you have the latest version of the bitsandbytes library. After installing PyTorch (follow the instructions [here](https://pytorch.org/get-started/locally/)), you can install the remaining dependencies with:
```bash
pip install -U -r requirements.txt
```
## Getting Started
The `qlora.py` code is a starting point for finetuning and inference on various datasets.
Basic command for finetuning a baseline model on the Alpaca dataset:
```bash
python qlora.py --model_name_or_path <path_or_name>
```
For models larger than 13B, we recommend adjusting the learning rate:
```bash
python qlora.py --learning_rate 0.0001 --model_name_or_path <path_or_name>
```
To replicate our Guanaco models, see below.
### Tutorials and Demonstrations
Here is [a blog](https://huggingface.co/blog/4bit-transformers-bitsandbytes) discussing 4-bit quantization, QLoRA, and how they are integrated in transformers.
You can host your own gradio Guanaco demo directly in Colab following [this notebook](https://colab.research.google.com/drive/17XEqL1JcmVWjHkT-WczdYkJlNINacwG7?usp=sharing).
In addition, here are Colab notebooks with examples for inference and finetuning using QLoRA:
- [Inference notebook](https://colab.research.google.com/drive/1ge2F1QSK8Q7h0hn3YKuBCOAS0bK8E0wf?usp=sharing)
- [Finetuning notebook](https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing)
Other examples can be found under the `examples/` folder. We include a getting-started generation example with Guanaco at `examples/guanaco_generate.py`.
### Quantization
Quantization parameters are controlled through the `BitsAndBytesConfig` ([see HF documentation](https://huggingface.co/docs/transformers/main_classes/quantization#transformers.BitsAndBytesConfig)) as follows:
- Loading in 4 bits is activated through `load_in_4bit`
- The datatype used for the linear layer computations is set with `bnb_4bit_compute_dtype`
- Nested quantization is activated through `bnb_4bit_use_double_quant`
- The datatype used for quantization is specified with `bnb_4bit_quant_type`. Note that there are two supported quantization datatypes: `fp4` (four-bit float) and `nf4` (normal four-bit float). The latter is theoretically optimal for normally distributed weights, and we recommend using `nf4`.
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    '/name/or/path/to/your/model',
    load_in_4bit=True,
    device_map='auto',
    max_memory=max_memory,
    torch_dtype=torch.bfloat16,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type='nf4'
    ),
)
```
### Paged Optimizer
You can enable the paged optimizer with the argument `--optim paged_adamw_32bit`.
### Guanaco Finetuning
You can select `--dataset oasst1` to load the OpenAssistant dataset that was used to train Guanaco. You can also find it on HF at [timdettmers/openassistant-guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco).
We include scripts to reproduce the hyperparameters of Guanaco model training for various sizes at `./scripts/finetune_guanaco*.sh`. Make sure to adjust `per_device_train_batch_size` and `gradient_accumulation_steps` so that their product is 16 and training fits on your GPUs.
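The batch-size constraint above can be sketched as a small helper that, given your per-device batch size, returns the matching number of accumulation steps. This is a hypothetical helper for illustration, not part of the repo:

```python
# Guanaco training uses an effective batch size of 16:
# per_device_train_batch_size * gradient_accumulation_steps == 16
TARGET_EFFECTIVE_BATCH = 16

def grad_accum_steps(per_device_train_batch_size: int) -> int:
    """Return gradient_accumulation_steps so the product equals 16."""
    if TARGET_EFFECTIVE_BATCH % per_device_train_batch_size != 0:
        raise ValueError("per-device batch size must evenly divide 16")
    return TARGET_EFFECTIVE_BATCH // per_device_train_batch_size

# e.g. a per-device batch size of 4 needs 4 accumulation steps
print(grad_accum_steps(4))
```

So if your GPUs only fit `--per_device_train_batch_size 2`, you would pass `--gradient_accumulation_steps 8` to keep the product at 16.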
### Using Local Datasets
You can specify the path to your dataset using the `--dataset` argument. If the `--dataset_format` argument is not set, it will default to the Alpaca format. Here are a few examples:
- Training with an *alpaca* format dataset:
```bash
python qlora.py --dataset="path/to/your/dataset"
```
- Training with a *self-instruct* format dataset:
```bash
python qlora.py --dataset="path/to/your/dataset" --dataset_format="self-instruct"
```
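For reference, an *alpaca*-format dataset is a JSON list of records with `instruction`, `input`, and `output` fields. A minimal sketch of writing one (the filename is hypothetical):

```python
import json

# Minimal alpaca-format records: an instruction, an optional input, and the
# target output. The "input" field may be an empty string.
records = [
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
    {"instruction": "Name a prime number.", "input": "", "output": "2"},
]

with open("my_alpaca_dataset.json", "w") as f:
    json.dump(records, f, indent=2)
```

You could then point `--dataset` at this file and rely on the default alpaca `--dataset_format`.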
### Multi GPU
Multi-GPU training and inference work out of the box with Hugging Face's Accelerate. Note that the `per_device_train_batch_size` and `per_device_eval_batch_size` arguments are global batch sizes, despite what their names suggest.
When loading a model for training or inference on multiple GPUs you should pass something like the following to `AutoModelForCausalLM.from_pretrained()`:
```python
device_map = "auto"
max_memory = {i: '46000MB' for i in range(torch.cuda.device_count())}
```
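The `max_memory` mapping above can also be built without a GPU at hand; this hypothetical helper just parametrizes the dict comprehension (in practice, `torch.cuda.device_count()` supplies the GPU count):

```python
# Hedged sketch: build the max_memory mapping for a given GPU count.
# In real use, num_gpus comes from torch.cuda.device_count().
def build_max_memory(num_gpus: int, per_gpu: str = "46000MB") -> dict:
    """Map each GPU index to its memory budget string."""
    return {i: per_gpu for i in range(num_gpus)}

print(build_max_memory(2))  # {0: '46000MB', 1: '46000MB'}
```

Adjust the per-GPU budget downward if other processes share the devices.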
## Sample Outputs
We provide generations for the models described in the paper for both OA and Vicuna queries in the `eval/generations` folder. These are intended to foster further research on model evaluation and analysis.
Can you distinguish ChatGPT from Guanaco? Give it a try!
You can access [the model response Colab here](https://colab.research.google.com/drive/1kK6xasHiav9nhiRUJjPMZb4fAED4qRHb?usp=sharing) comparing ChatGPT and Guanaco 65B on Vicuna prompts.
## Evaluation
We include scripts adapted from the FastChat repo to automatically evaluate model generations using GPT-4. We include scripts for comparisons relative to ChatGPT with scores out of 10 as well as "pairwise comparisons" with three-class labeling (win, lose, or tie). These are found in the `eval` folder.
To facilitate the replication of our evaluation and future work in this area, we release GPT-4 and human ratings of our systems. These are found under `eval/ratings-human` and `eval/ratings-gpt4`.
More details can be found at `eval/EVAL_README.md`.