# LLaMa 7b in Rust
This repo contains the popular [LLaMa 7b](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
language model, fully implemented in the Rust programming language!
Uses [dfdx](https://github.com/coreylowman/dfdx) tensors and CUDA acceleration.
**This runs LLaMa directly in f16, so there is no hardware acceleration on CPU.** Using CUDA is heavily recommended.
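For a sense of scale, f16 stores each parameter in 2 bytes, so the 7b model's weights occupy roughly 14 GB in memory (versus ~28 GB in f32). A quick back-of-envelope check:

```python
# Back-of-envelope memory footprint for LLaMa 7b weights.
params = 7_000_000_000
f16_gb = params * 2 / 1e9   # half precision: 2 bytes per parameter -> ~14 GB
f32_gb = params * 4 / 1e9   # single precision: 4 bytes per parameter -> ~28 GB
print(f16_gb, f32_gb)
```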
Here is the 7b model running on an A10 GPU:
![](llama-7b-a10.gif)
# How To Run
## (Once) Setting up model weights
### Download model weights
1. Install git lfs. On ubuntu you can run `sudo apt install git-lfs`
2. Activate git lfs with `git lfs install`.
3. Run one of the following commands to download the model weights in PyTorch format:
1. LLaMa 7b (~25 GB): `git clone https://huggingface.co/decapoda-research/llama-7b-hf`
2. LLaMa 13b (~75 GB): `git clone https://huggingface.co/decapoda-research/llama-13b-hf`
3. LLaMa 65b (~244 GB): `git clone https://huggingface.co/decapoda-research/llama-65b-hf`
### Convert the model
1. (Optional) Run `python3.x -m venv <my_env_name>` to create a python virtual environment, where `x` is your preferred python version
2. (Optional, requires 1.) Run `source <my_env_name>/bin/activate` (or `<my_env_name>\Scripts\activate` if on Windows) to activate the environment
3. Run `pip install numpy torch`
4. Run `python convert.py` to convert the model weights into a format the Rust code can read:
    a. LLaMa 7b: `python convert.py`
    b. LLaMa 13b: `python convert.py llama-13b-hf`
    c. LLaMa 65b: `python convert.py llama-65b-hf`
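The core idea of the conversion step is writing each weight tensor out as raw little-endian f16 bytes that the Rust side can load directly. The actual `convert.py` uses numpy and torch and its file layout may differ; this is a minimal stdlib-only sketch of that idea, using `struct`'s half-precision `'e'` format:

```python
import struct

def export_f16(values, path):
    # Pack floats as little-endian IEEE-754 half precision ('e' format),
    # producing a contiguous f16 byte buffer a loader could read back.
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}e", *values))

def load_f16(path):
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{len(data) // 2}e", data))

# A small weight vector round-trips through the f16 file format.
# These values are exactly representable in f16, so they survive unchanged.
export_f16([0.0, 1.5, -2.25], "w.bin")
print(load_f16("w.bin"))  # [0.0, 1.5, -2.25]
```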
## (Once) Compile
You can compile with standard Cargo commands.
With CUDA:
```bash
cargo build --release -F cuda
```
Without CUDA:
```bash
cargo build --release
```
## Run the executable
With default args, using one of the three subcommands:
```bash
./target/release/llama-dfdx --model <model-dir> generate "<prompt>"
./target/release/llama-dfdx --model <model-dir> chat
./target/release/llama-dfdx --model <model-dir> file <path to prompt file>
```
To see all available commands and custom args:
```bash
./target/release/llama-dfdx --help
```