# llama2j
This is a pure Java implementation of standalone Llama 2 inference, without any dependencies.
The project is currently intended for research use.
In addition, we implement a CUDA version, where the transformer is implemented
as a number of CUDA kernels. Java code runs the kernels on the GPU using JCuda.
The purpose of this project is to provide good-performance inference for Llama 2 models
that can run anywhere and integrate easily with Java code. We want to make the LLM
locally available to backend code, so that it becomes a seamless, integrated part
of the application backend and can be scaled together with it.
Features:
- 4 second startup time for the Llama 7B model
- CPU support
- Single or multiple Nvidia GPU support
- I8 quantization of weights on the fly
- Caching of I8 weights
- Activations are FP32 (this is W8A32 quantization)
- CPU and CUDA implementations are identical and validatable against each other
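
To make the W8A32 idea above concrete, here is a minimal sketch of symmetric I8 weight quantization with FP32 activations. It is only an illustration: the class name `Int8QuantSketch`, the group size, and the per-group max-abs scaling are assumptions made for this example, and the quantization scheme actually used by llama2j may differ.

```java
import java.util.Arrays;

/** Sketch of W8A32: weights stored as int8 plus FP32 scales, activations kept in FP32. */
final class Int8QuantSketch {
    // Hypothetical group size chosen for the example; not taken from llama2j.
    static final int GROUP_SIZE = 4;

    /** Quantized weights: one signed byte per weight, one FP32 scale per group. */
    record Quantized(byte[] values, float[] scales) {}

    static Quantized quantize(float[] weights) {
        int groups = (weights.length + GROUP_SIZE - 1) / GROUP_SIZE;
        byte[] values = new byte[weights.length];
        float[] scales = new float[groups];
        for (int g = 0; g < groups; g++) {
            int start = g * GROUP_SIZE;
            int end = Math.min(start + GROUP_SIZE, weights.length);
            float maxAbs = 0f;
            for (int i = start; i < end; i++) {
                maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
            }
            // Symmetric scheme (an assumption): map [-maxAbs, maxAbs] onto [-127, 127].
            float scale = maxAbs == 0f ? 1f : maxAbs / 127f;
            scales[g] = scale;
            for (int i = start; i < end; i++) {
                values[i] = (byte) Math.round(weights[i] / scale);
            }
        }
        return new Quantized(values, scales);
    }

    /** Dot product of int8 weights with FP32 activations, accumulated in FP32. */
    static float dot(Quantized q, float[] activations) {
        float sum = 0f;
        for (int i = 0; i < activations.length; i++) {
            // Dequantize on the fly: int8 value times the scale of its group.
            sum += q.values()[i] * q.scales()[i / GROUP_SIZE] * activations[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] w = {0.5f, -1.0f, 0.25f, 0.0f, 2.0f, -0.5f};
        float[] x = {1f, 2f, 3f, 4f, 5f, 6f};
        Quantized q = quantize(w);
        System.out.println("int8 weights: " + Arrays.toString(q.values()));
        System.out.println("approximate dot product: " + dot(q, x));
    }
}
```

The result of `dot` only approximates the FP32 dot product, because each weight is rounded to one of 255 levels within its group.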
Tested on:
- Ubuntu 22.04.2 and 22.04.3
- Windows 11 Version 10.0.22621 Build 22621
- Llama 7B model and smaller models
- Intel and AMD CPUs
- Java 20
- CUDA 11.2
- JCuda 11.2.0
- 1-4x RTX 4090
# Performance
Tokens per second is printed out at the end of the run, and it excludes model checkpoint
loading time.
NOTE: llama2.c has been compiled as 'make runomp' for the fastest performance.
| Command | Configuration 1 | Configuration 2 | Configuration 3 |
|------------------------------------------------------|-----------------|---------------|----------------|
| llama2j --mode CPU --checkpoint Llama-2-7b-chat.bin | 6.6 tok/s | 4.0 tok/s | 1.8 tok/s |
| llama2j --mode CUDA --checkpoint Llama-2-7b-chat.bin | 20.9 tok/s | 21.0 tok/s | 17.0 tok/s |
| llama2.c (OMP_NUM_THREADS=32) | 12.0 tok/s | 2.3 tok/s | - |
| llama2.c (OMP_NUM_THREADS=64) | 9.5 tok/s | 2.2 tok/s | - |
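
As an illustration of how a tokens-per-second figure that excludes checkpoint loading can be computed, the sketch below times only the generation step. The class `ThroughputSketch` and its methods are hypothetical and are not part of llama2j; the `IntSupplier` stands in for whatever code generates tokens and reports how many it produced.

```java
import java.util.function.IntSupplier;

/** Sketch: measure tokens per second for the generation phase only. */
final class ThroughputSketch {

    /** Tokens per second for a phase that produced `tokens` in `elapsedNanos`. */
    static double tokensPerSecond(int tokens, long elapsedNanos) {
        if (tokens < 0 || elapsedNanos <= 0) {
            throw new IllegalArgumentException("tokens must be >= 0 and elapsedNanos > 0");
        }
        return tokens / (elapsedNanos / 1e9);
    }

    /**
     * Runs the generation step and times it. Checkpoint loading is assumed
     * to have finished before this method is called, so it is not counted.
     */
    static double measure(IntSupplier generate) {
        long start = System.nanoTime();
        int tokens = generate.getAsInt();
        long elapsed = Math.max(System.nanoTime() - start, 1L);
        return tokensPerSecond(tokens, elapsed);
    }

    public static void main(String[] args) {
        // Stand-in for real inference: pretend 256 tokens were generated.
        double tps = measure(() -> {
            long busy = 0;
            for (int i = 0; i < 5_000_000; i++) {
                busy += i;
            }
            return busy >= 0 ? 256 : 0;
        });
        System.out.printf("%.1f tok/s%n", tps);
    }
}
```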
The duration of model checkpoint loading depends on whether the model is being loaded for the first
time or has already been processed and cached. The time includes allocating memory, loading weights
from the disk, quantizing the weights if necessary, and transferring the data to GPU devices.
| Command | Configuration 1 | Configuration 2 | Configuration 3 |
|--------------------------------------------------------------------------|-----------------|-----------------|-----------------|
| Load Llama-2-7b-chat for the first time, quantize, and store quant files | 15.3 s | 27.4 s | 38.0 s |
| Load Llama-2-7b-chat from cached quant files | 0.8 s | 1.3 s | 1.7 s |
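
The difference between the two rows comes from reusing cached quantized weights. The following is a rough sketch of that load-or-quantize pattern, not the project's actual loader: the class `QuantCacheSketch`, the `.i8cache` suffix, and the `Function`-based quantizer are assumptions for the example. It also uses plain byte arrays, which would be too small for a real 7B checkpoint; a real loader would need chunked or memory-mapped I/O.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

/** Sketch: quantize on first load, reuse the cached result on later loads. */
final class QuantCacheSketch {

    /** Hypothetical naming rule: the cache file sits next to the checkpoint. */
    static Path cachePathFor(Path checkpoint) {
        return checkpoint.resolveSibling(checkpoint.getFileName() + ".i8cache");
    }

    static byte[] loadQuantized(Path checkpoint, Function<Path, byte[]> quantizer) {
        Path cache = cachePathFor(checkpoint);
        try {
            if (Files.exists(cache)) {
                // Fast path: the weights were quantized and stored by an earlier run.
                return Files.readAllBytes(cache);
            }
            // Slow path: quantize now and store the result for the next run.
            byte[] quantized = quantizer.apply(checkpoint);
            Files.write(cache, quantized);
            return quantized;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("llama2j-sketch");
        Path checkpoint = Files.write(dir.resolve("model.bin"), new byte[] {1, 2, 3});
        Function<Path, byte[]> quantizer = p -> {
            System.out.println("quantizing " + p.getFileName());
            return new byte[] {42};
        };
        loadQuantized(checkpoint, quantizer); // first load: runs the quantizer
        loadQuantized(checkpoint, quantizer); // second load: served from the cache file
    }
}
```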
The test system configurations are:
| Configuration | System |
|-----------------|---------------------------------------------------------------------------------------------------------------------|
| Configuration 1 | Ubuntu 22.04.3, MZ33-AR0-000, AMD EPYC 9374F 32-core processor, (1 of 4) * Nvidia 4090, 368GB 4800 DDR5 |
| Configuration 2 | Ubuntu 22.04.3, ROG CROSSHAIR X670E EXTREME, AMD 7950X 16-core processor, 1 * Nvidia 4090, 64GB 4800 DDR5 |
| Configuration 3 | Windows 11 Pro Build 22621, ROG MAXIMUS Z790 APEX, Intel 13900KS 24-core processor, 1 * Nvidia 4090, 32GB 7600 DDR5 |
# Quick and Easy Installation
For Ubuntu, follow these instructions. For Windows 11, see the section below.
## Install dependencies
This also installs the dependencies needed to use llama2.c to convert models into the llama2.c format
that llama2j can use.
```console
# install dependencies needed to also run llama2.c and
# to process models from hugging face and to set system performance configuration
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential git cmake python3 clang libomp-dev git-lfs python3-pip maven tuned \
linux-tools-6.2.0-26-generic linux-cloud-tools-6.2.0-26-generic \
linux-tools-6.2.0-31-generic linux-cloud-tools-6.2.0-31-generic \
linux-tools-generic \
linux-cloud-tools-generic \
-y
git config --global credential.helper store
pip install --upgrade huggingface_hub
pip install transformers
# Install Java 20
wget https://download.oracle.com/java/20/latest/jdk-20_linux-x64_bin.deb
sudo dpkg -i jdk-20_linux-x64_bin.deb
# make sure JDK 20 java is first in your path
export PATH=/usr/lib/jvm/jdk-20/bin/:$PATH
# add the same path setting also to ~/.bashrc if you prefer
# check you have the correct Java, e.g. java 20.0.2 or later
java --version
```
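
If you prefer to verify the Java version from code, for example at application startup, a small check along these lines is possible. This is only a sketch: the class name `JavaVersionCheck` is made up for the example, and treating every feature release from 20 upwards as acceptable is an assumption, since the project states it was tested on Java 20.

```java
/** Sketch: check that the running JVM is Java 20 or newer. */
final class JavaVersionCheck {
    // Assumption: any feature release >= 20 is acceptable.
    static final int REQUIRED_FEATURE = 20;

    static boolean isSupported(Runtime.Version version) {
        return version.feature() >= REQUIRED_FEATURE;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        System.out.println("Running on Java " + current);
        if (!isSupported(current)) {
            System.err.println("Java " + REQUIRED_FEATURE + " or later is expected");
            System.exit(1);
        }
    }
}
```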
## Set up CUDA
First check that you have NVIDIA drivers installed. If not, download and install them from the NVIDIA site. Good luck!
```console
nvidia-smi
```
The output should show a Driver Version that is >= 525.00:
```console
Thu Aug 31 13:04:25 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 Off | 00000000:01:00.0 On | Off |
| 0% 43C P8 28W / 450W | 1294MiB / 24564MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
```
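
The driver check above can also be automated. The sketch below extracts the driver version from text in the format shown and compares its major number against 525. The class `DriverVersionCheck` is hypothetical, and the regular expression assumes the header keeps the `Driver Version: X.Y` form seen in the sample output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: find the driver version in nvidia-smi output and compare it to 525. */
final class DriverVersionCheck {
    static final int MIN_DRIVER_MAJOR = 525;

    // Assumes the header contains "Driver Version: <major>.<minor>...".
    private static final Pattern DRIVER =
            Pattern.compile("Driver Version:\\s*(\\d+)\\.(\\d+)");

    /** Returns true if a driver version is found and its major number is >= 525. */
    static boolean driverIsRecentEnough(String nvidiaSmiOutput) {
        Matcher m = DRIVER.matcher(nvidiaSmiOutput);
        if (!m.find()) {
            return false; // no driver line found, treat as not installed
        }
        return Integer.parseInt(m.group(1)) >= MIN_DRIVER_MAJOR;
    }

    public static void main(String[] args) {
        String header =
                "| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |";
        System.out.println("driver ok: " + driverIsRecentEnough(header));
    }
}
```

Comparing only the major number is enough here, because any 525.xx driver already satisfies the >= 525.00 requirement.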
Check the CUDA version carefully. If it is anything other than 12.0
(for example, 12.2 is not compatible), install CUDA 12.0 by following
the exact instructions below. This helps you avoid breaking any
other dependencies you might have on your current drivers.
NOTE: do not replace the drivers (unless you want to).
- Uncheck 'install drivers'!
- Accept the other defaults and install
```console
wget https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run
sudo sh cuda_12.0.0_525.60.13_linux.run
```
The installer will complain "***WARNING: Incomplete installation!", which
is not an error condition. You have your drivers and are good to go.
## Download and build
On Ubuntu, everything is now ready for cloning and building the project.
```console
mvn clean package
```
On Windows, just download the project and follow the guide below.
## Windows 11 quick installation guide
- Install git from https://git-scm.com/download/win
- Install maven from https://maven.apache.org/download.cgi
- Open PowerShell Prompt
- Go to the 'models' subdirectory under the 'llama2j' directory, for example:
```console
cd .\IdeaProjects\llama2j\models
```
Download a test model file
```console
curl https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin -OutFile stories15M.bin
```
CUDA toolkit
- Install Anaconda on Windows
- Open 'Anaconda Prompt (miniconda3)' from the start menu
- Install CUDA 12.0 on conda
```console
conda install cuda -c nvidia/label/cuda-12.0.0
```
Microsoft C++ compiler
- Install Microsoft Visual Studio or Build Tools.
- Add the directory that contains the Microsoft C++ compiler "cl.exe" to your PATH variable. The location varies, but on the test computer
it is:
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\bin\Hostx64\x64
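
To confirm that the compiler can be found, you can search the PATH entries for "cl.exe". The sketch below does this in Java. The class `FindOnPathSketch` is an illustration only and is not part of llama2j; it simply looks for a regular file with the given name in each PATH directory.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Optional;
import java.util.regex.Pattern;

/** Sketch: look for an executable such as cl.exe in the directories listed in PATH. */
final class FindOnPathSketch {

    static Optional<Path> find(String executable, String pathValue) {
        if (pathValue == null) {
            return Optional.empty();
        }
        // File.pathSeparator is ';' on Windows and ':' on Linux.
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isBlank()) {
                continue;
            }
            try {
                Path candidate = Path.of(dir, executable);
                if (Files.isRegularFile(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // Ignore PATH entries that are not valid paths on this platform.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> cl = find("cl.exe", System.getenv("PATH"));
        System.out.println(cl.map(p -> "found " + p).orElse("cl.exe is not on PATH"));
    }
}
```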
Now, everything should be ready to build on Windows.
```console
mvn clean
mvn package
```
That's all for Windows. For any further steps, follow the instructions below.
NOTE: Model conversion is only supported under Ubuntu.