# 🦙🌲🤏 Alpaca-LoRA
- 🤗 **Try the pretrained model out [here](https://huggingface.co/spaces/tloen/alpaca-lora), courtesy of a GPU grant from Hugging Face!**
- Users have created a Discord server for discussion and support [here](https://discord.gg/prbq284xX5)
This repository contains code for reproducing the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) results using [low-rank adaptation (LoRA)](https://arxiv.org/pdf/2106.09685.pdf).
We provide an Instruct model of similar quality to `text-davinci-003` that can run [on a Raspberry Pi](https://twitter.com/miolini/status/1634982361757790209) (for research),
and the code is easily extended to the `13b`, `30b`, and `65b` models.
In addition to the training code, which runs within five hours on a single RTX 4090,
we publish a script for downloading and inference on the foundation model and LoRA,
as well as the resulting [LoRA weights themselves](https://huggingface.co/tloen/alpaca-lora-7b/tree/main).
To fine-tune cheaply and efficiently, we use Hugging Face's [PEFT](https://github.com/huggingface/peft)
as well as Tim Dettmers' [bitsandbytes](https://github.com/TimDettmers/bitsandbytes).
Without hyperparameter tuning, the LoRA model produces outputs comparable to the Stanford Alpaca model. (Please see the outputs included below.) Further tuning might be able to achieve better performance; I invite interested users to give it a try and report their results.
## Setup
1. Install dependencies
```bash
pip install -r requirements.txt
```
1. Set environment variables, or modify the files referencing `BASE_MODEL`:
```bash
# Files referencing `BASE_MODEL`
# export_hf_checkpoint.py
# export_state_dict_checkpoint.py
export BASE_MODEL=decapoda-research/llama-7b-hf
```
Both `finetune.py` and `generate.py` accept a `--base_model` flag instead, as shown further below; the export scripts read `BASE_MODEL` from the environment directly (see the sketch after this list).
1. If bitsandbytes doesn't work, [install it from source.](https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md) Windows users can follow [these instructions](https://github.com/tloen/alpaca-lora/issues/17).
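For reference, a minimal sketch of how the export scripts can resolve `BASE_MODEL` (an assumed pattern for illustration; check `export_hf_checkpoint.py` for the exact code):

```python
# Sketch (assumed pattern): resolve the base model from the BASE_MODEL
# environment variable; see export_hf_checkpoint.py for the real code.
import os

BASE_MODEL = os.environ.get("BASE_MODEL", None)
assert BASE_MODEL is not None, (
    "Please set BASE_MODEL, e.g. `export BASE_MODEL=decapoda-research/llama-7b-hf`"
)
```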
### Training (`finetune.py`)
This file contains a straightforward application of PEFT to the LLaMA model,
as well as some code related to prompt construction and tokenization.
PRs adapting this code to support larger models are always welcome.
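For orientation, the prompt construction follows the standard Stanford Alpaca template; the sketch below illustrates the idea (see `finetune.py` for the authoritative version):

```python
# Sketch of Alpaca-style prompt construction, assuming the standard
# Stanford Alpaca template; finetune.py contains the exact code.
def generate_prompt(instruction: str, input: str = "", output: str = "") -> str:
    if input:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input}\n\n"
            f"### Response:\n{output}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{output}"
    )
```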
Example usage:
```bash
python finetune.py \
--base_model 'decapoda-research/llama-7b-hf' \
--data_path 'yahma/alpaca-cleaned' \
--output_dir './lora-alpaca'
```
We can also tweak our hyperparameters:
```bash
python finetune.py \
--base_model 'decapoda-research/llama-7b-hf' \
--data_path 'yahma/alpaca-cleaned' \
--output_dir './lora-alpaca' \
--batch_size 128 \
--micro_batch_size 4 \
--num_epochs 3 \
--learning_rate 1e-4 \
--cutoff_len 512 \
--val_set_size 2000 \
--lora_r 8 \
--lora_alpha 16 \
--lora_dropout 0.05 \
--lora_target_modules '[q_proj,v_proj]' \
--train_on_inputs \
--group_by_length
```
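For readers curious how these flags map onto PEFT, the following sketch shows the rough shape of the setup (simplified; `finetune.py` is authoritative, and `prepare_model_for_int8_training` assumes a recent PEFT version):

```python
# Sketch: how the --lora_* flags above translate into a PEFT LoRA setup.
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,   # int8 weights via bitsandbytes
    device_map="auto",
)
model = prepare_model_for_int8_training(model)  # freeze base weights, cast norms

config = LoraConfig(
    r=8,                                  # --lora_r
    lora_alpha=16,                        # --lora_alpha
    target_modules=["q_proj", "v_proj"],  # --lora_target_modules
    lora_dropout=0.05,                    # --lora_dropout
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the small LoRA matrices train
```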
### Inference (`generate.py`)
This file reads the foundation model from the Hugging Face model hub and the LoRA weights from `tloen/alpaca-lora-7b`, and runs a Gradio interface for inference on a specified input. Users should treat this as example code for the use of the model, and modify it as needed.
Example usage:
```bash
python generate.py \
--load_8bit \
--base_model 'decapoda-research/llama-7b-hf' \
--lora_weights 'tloen/alpaca-lora-7b'
```
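In essence, the script does something like the following simplified sketch (the real `generate.py` adds the Gradio UI, prompt templating, and generation-config options):

```python
# Simplified sketch of what generate.py does; assumes a CUDA GPU.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "tloen/alpaca-lora-7b")  # apply LoRA
model.eval()

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nTell me about alpacas.\n\n### Response:\n"
)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
with torch.no_grad():
    output = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```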
### Checkpoint export (`export_*_checkpoint.py`)
These files contain scripts that merge the LoRA weights back into the base model
for export to Hugging Face format and to PyTorch `state_dicts`.
They should help users
who want to run inference in projects like [llama.cpp](https://github.com/ggerganov/llama.cpp)
or [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp).
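Conceptually, merging folds each low-rank update back into its base weight matrix (W ← W + (alpha/r)·BA), after which the model no longer needs PEFT at inference time. Below is a minimal sketch using PEFT's `merge_and_unload` (available in recent PEFT versions; the export scripts here perform an equivalent manual merge):

```python
# Sketch: merge LoRA weights into the base model and save a plain
# Hugging Face checkpoint. Assumes a recent PEFT with merge_and_unload;
# the export_*_checkpoint.py scripts do an equivalent manual merge.
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,  # merge in fp16, not int8
)
model = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b")
merged = model.merge_and_unload()     # W <- W + (alpha/r) * B @ A
merged.save_pretrained("./hf_ckpt")   # standalone checkpoint, no PEFT needed
```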
### Notes
- We could likely improve our model performance significantly with a better dataset. Consider supporting the [LAION Open Assistant](https://open-assistant.io/) effort to produce a high-quality dataset for supervised fine-tuning (or bugging them to release their data).
- We're continually fixing bugs and conducting training runs, and the weights on the Hugging Face Hub are being updated accordingly. In particular, those facing issues with response lengths should make sure that they have the latest version of the weights and code.
- Users with multiple GPUs should take a look [here](https://github.com/tloen/alpaca-lora/issues/8#issuecomment-1477490259).
- We include the Stanford Alpaca dataset, which was made available under the ODC Attribution License.
### Resources
- [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp), a native client for running Alpaca models on the CPU
- [Alpaca-LoRA-Serve](https://github.com/deep-diver/Alpaca-LoRA-Serve), a ChatGPT-style interface for Alpaca models
- [AlpacaDataCleaned](https://github.com/gururise/AlpacaDataCleaned), a project to improve the quality of the Alpaca dataset
- Various adapter weights (download at own risk):
  - 7B:
    - <https://huggingface.co/tloen/alpaca-lora-7b>
    - <https://huggingface.co/samwit/alpaca7B-lora>
    - 🤖 <https://huggingface.co/nomic-ai/gpt4all-lora>
    - 🇧🇷 <https://huggingface.co/22h/cabrita-lora-v0-1>
    - 🇨🇳 <https://huggingface.co/ziqingyang/chinese-alpaca-lora-7b>
    - 🇨🇳 <https://huggingface.co/qychen/luotuo-lora-7b-0.1>
    - 🇯🇵 <https://huggingface.co/kunishou/Japanese-Alapaca-LoRA-7b-v0>
    - 🇫🇷 <https://huggingface.co/bofenghuang/vigogne-lora-7b>
    - 🇹🇭 <https://huggingface.co/Thaweewat/thai-buffala-lora-7b-v0-1>
    - 🇩🇪 <https://huggingface.co/thisserand/alpaca_lora_german>
    - 🇮🇹 <https://huggingface.co/teelinsan/camoscio-7b-llama>
  - 13B:
    - <https://huggingface.co/chansung/alpaca-lora-13b>
    - <https://huggingface.co/mattreid/alpaca-lora-13b>
    - <https://huggingface.co/samwit/alpaca13B-lora>
    - 🇯🇵 <https://huggingface.co/kunishou/Japanese-Alapaca-LoRA-13b-v0>
    - 🇰🇷 <https://huggingface.co/chansung/koalpaca-lora-13b>
    - 🇨🇳 <https://huggingface.co/facat/alpaca-lora-cn-13b>
    - 🇪🇸 <https://huggingface.co/plncmm/guanaco-lora-13b>
  - 30B:
    - <https://huggingface.co/baseten/alpaca-30b>
    - <https://huggingface.co/chansung/alpaca-lora-30b>
    - 🇯🇵 <https://huggingface.co/kunishou/Japanese-Alapaca-LoRA-30b-v0>
- [alpaca-native](https://huggingface.co/chavinlo/alpaca-native), a replication using the original Alpaca code
### Example outputs
**Instruction**: Tell me about alpacas.
**Alpaca-LoRA**: Alpacas are members of the camelid family and are native to the Andes Mountains of South America. They are known for their soft, luxurious fleece, which is used to make clothing, blankets, and other items. Alpacas are herbivores and graze on grasses and other plants. They are social animals and live in herds of up to 20 individuals.
**Stanford Alpaca**: Alpacas are small, fluffy animals related to camels and llamas. They are native to Peru and Bolivia, and were first domesticated around 5,000 years ago. They are kept mainly for their fine, soft fleece, which is used to make knitwear and other garments. Alpacas are herd animals and live in small family groups, led by an older male. They are highly social animals and can form strong bonds with their owners.
**text-davinci-003**: Alpacas are a domesticated species of South American camelid, similar to llamas. They are native to the Andes Mountains in South America and are kept as livestock for their soft, luxurious wool. Alpacas are social animals and live in herds of up to 20 individuals. They are typically kept in small herds of two to five animals, and are relatively easy to care for. Alpacas are herbivores and feed on grass, hay, and other vegetation. They are also known for their gentle and friendly nature, making them popular as pets.
---
**Instruction**: Tell me about