# 💯AI00 RWKV Server
<p align='center'>
<img src="docs/public/logo.gif" />
</p>
<div align="center">
![license](https://shields.io/badge/license-MIT%2FApache--2.0-blue)
[![Rust Version](https://img.shields.io/badge/Rust-1.75.0+-blue)](https://releases.rs/docs/1.75.0)
![PRs welcome](https://img.shields.io/badge/PRs-Welcome-brightgreen)
<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
[![All Contributors](https://img.shields.io/badge/all_contributors-7-orange.svg?style=flat-square)](#contributors-)
<!-- ALL-CONTRIBUTORS-BADGE:END -->
[English](README.md) | [中文](README_zh.md)
---
<div align="left">
`AI00 RWKV Server` is an inference API server for the [`RWKV` language model](https://github.com/BlinkDL/ChatRWKV) based upon the [`web-rwkv`](https://github.com/cryscan/web-rwkv) inference engine.
It supports `VULKAN` parallel and concurrent batched inference and can run on all GPUs that support `VULKAN`. No need for Nvidia cards!!! AMD cards and even integrated graphics can be accelerated!!!
No need for bulky `pytorch`, `CUDA` and other runtime environments, it's compact and ready to use out of the box!
Compatible with OpenAI's ChatGPT API interface.
100% open source and commercially usable, under the MIT license.
If you are looking for a fast, efficient, and easy-to-use LLM API server, then `AI00 RWKV Server` is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.
Join the `AI00 RWKV Server` community now and experience the charm of AI!
QQ Group for communication: 30920262
### 💥Features
* Based on the `RWKV` model, it has high performance and accuracy
* Supports `VULKAN` inference acceleration, you can enjoy GPU acceleration without the need for `CUDA`! Supports AMD cards, integrated graphics, and all GPUs that support `VULKAN`
* No need for bulky `pytorch`, `CUDA` and other runtime environments, it's compact and ready to use out of the box!
* Compatible with OpenAI's ChatGPT API interface
### ⭐Usages
* Chatbots
* Text generation
* Translation
* Q&A
* Any other tasks that LLM can do
### 👻Other
* Based on the [web-rwkv](https://github.com/cryscan/web-rwkv) project
* Model download: [V5](https://huggingface.co/cgisky/AI00_RWKV_V5) or [V6](https://huggingface.co/cgisky/ai00_rwkv_x060)
## Installation, Compilation, and Usage
### 📦Download Pre-built Executables
1. Download the latest version directly from the [Releases](https://github.com/cgisky1980/ai00_rwkv_server/releases) page
2. After [downloading the model](#👻other), place the model in the `assets/models/` path, for example, `assets/models/RWKV-x060-World-3B-v2-20240228-ctx4096.st`
3. Optionally modify [`assets/Config.toml`](./assets/Config.toml) for model configurations like model path, quantization layers, etc.
4. Run in the command line
```bash
$ ./ai00_rwkv_server
```
5. Open the browser and visit the WebUI [`https://localhost:65530`](https://localhost:65530)
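Step 3 above points at `assets/Config.toml`. As an illustrative sketch only, a minimal configuration might look like the fragment below; the key names (`path`, `quant`) are assumptions for illustration, so consult the `Config.toml` shipped in `assets/` for the authoritative schema:

```toml
# Hypothetical sketch of a model section -- verify key names against
# the Config.toml that ships with your release.
[model]
path = "assets/models/RWKV-x060-World-3B-v2-20240228-ctx4096.st"  # model file to load
quant = 0                                                          # number of quantized layers
```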
### 🔨(Optional) Build from Source
1. [Install Rust](https://www.rust-lang.org/)
2. Clone this repository
```bash
$ git clone https://github.com/cgisky1980/ai00_rwkv_server.git
$ cd ai00_rwkv_server
```
3. After [downloading the model](#👻other), place the model in the `assets/models/` path, for example, `assets/models/RWKV-x060-World-3B-v2-20240228-ctx4096.st`
4. Compile
```bash
$ cargo build --release
```
5. After compilation, run
```bash
$ cargo run --release
```
6. Open the browser and visit the WebUI [`https://localhost:65530`](https://localhost:65530)
### 📝Convert the Model
The server currently only supports Safetensors models with the `.st` extension. Models saved in PyTorch's `.pth` format need to be converted before use.
1. [Download the `.pth` model](https://huggingface.co/BlinkDL)
2. In the [Releases](https://github.com/cgisky1980/ai00_rwkv_server/releases) you can find an executable called `converter`. Run
```bash
$ ./converter --input /path/to/model.pth
```
3. If you are building from source, run
```bash
$ cargo run --release --bin converter -- --input /path/to/model.pth
```
4. Just like the steps mentioned above, place the converted `.st` model in the `assets/models/` path and modify the model path in [`assets/Config.toml`](./assets/Config.toml)
## 📝Supported Arguments
* `--config`: Configuration file path (default: `assets/Config.toml`)
* `--ip`: The IP address the server is bound to
* `--port`: Running port
## 📙Currently Available APIs
The API service starts at port 65530, and the data input and output format follow the OpenAI API specification.
* `/api/oai/v1/models`
* `/api/oai/models`
* `/api/oai/v1/chat/completions`
* `/api/oai/chat/completions`
* `/api/oai/v1/completions`
* `/api/oai/completions`
* `/api/oai/v1/embeddings`
* `/api/oai/embeddings`
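Since the endpoints follow OpenAI conventions, a request can be built with any HTTP client. The sketch below uses only the Python standard library; the payload fields follow OpenAI's chat completion format, and the base URL comes from the defaults above (note the server uses a self-signed HTTPS certificate, so actually sending the request requires a certificate-check override):

```python
import json
from urllib import request

# Base URL from the defaults documented above.
BASE_URL = "https://localhost:65530/api/oai"

def build_chat_request(messages, max_tokens=256):
    """Build an OpenAI-style chat completion request for the local server."""
    body = json.dumps({
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": False,  # set True for SSE streaming
    }).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "Hello!"}])
print(req.full_url)
# Sending it would be: json.load(request.urlopen(req))
# (requires the server to be running, plus an ssl context that
#  accepts the self-signed certificate)
```

The same pattern applies to `/v1/completions` and `/v1/embeddings`; only the path and payload fields change.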
## 📙WebUI Screenshots
### Chat Feature
<img src="img/chat_en.gif" />
### Continuation Feature
<img src="img/continuation_en.gif" />
### Paper Writing Feature
<img src="img/paper_en.gif" />
## 📝TODO List
* [x] Support for `text_completions` and `chat_completions`
* [x] Support for SSE push
* [x] Add `embeddings`
* [x] Integrate basic front-end
* [x] Parallel inference via `batch serve`
* [x] Support for `int8` quantization
* [x] Support for `NF4` quantization
* [x] Support for `LoRA` model
* [ ] Hot loading and switching of `LoRA` model
## 👥Join Us
We are always looking for people interested in helping us improve the project. If you are interested in any of the following, please join us!
* 📝Writing code
* 💬Providing feedback
* 🤔Proposing ideas or needs
* 🧪Testing new features
* ✏️Translating documentation
* 📣Promoting the project
* 🎉Anything else that would be helpful to us
No matter your skill level, we welcome you to join us. You can join us in the following ways:
* Join our Discord channel
* Join our QQ group
* Submit issues or pull requests on GitHub
* Leave feedback on our website
We can't wait to work with you to make this project better! We hope the project is helpful to you!
## Thanks to these awesome, insightful individuals for their support and selfless dedication to the project
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
<tbody>
<tr>
      <td align="center" valign="top" width="14.28%"><a href="https://github.com/cgisky1980"><img src="https://avatars.githubusercontent.com/u/82481660?v=4?s=100" width="100px;" alt="顾真牛"/><br /><sub><b>顾真牛</b></sub></a><br /><a href="https://github.com/Ai00-X/ai00_server/commits?author=cgisky1980" title="Documentation">📖</a> <a href="https://github.com/Ai00-X/ai00_server/commits?author=cgisky1980" title="Code">💻</a> <a href="#content-cgisky1980" title="Content">🖋</a> <a href="#design-cgisky1980" title="Design">🎨</a> <a href="#mentoring-cgisky1980" title="Mentoring">🧑‍🏫</a></td>
      <td align="center" valign="top" width="14.28%"><a href="https://cryscan.github.io/profile"><img src="https://avatars.githubusercontent.com/u/16053640?v=4?s=100" width="100px;" alt="研究社交"/><br /><sub><b>研究社交</b></sub></a><br /><a href="https://github.com/Ai00-X/ai00_server/commits?author=cryscan" title="Code">💻</a> <a href="#example-cryscan" title="Examples">💡</a> <a href="#ideas-cryscan" title="Ideas, Planning, & Feedback">🤔</a> <a href="#maintenance-cryscan" title="Maintenance">🚧</a> <a href="https://github.com/Ai00-X/ai00_server/pulls?q=is%3Apr+reviewed-by%3Acryscan" title="Reviewed Pull Requests">👀</a> <a href="#platform-cryscan" title="Packaging/porting to new platform">📦</a></td>
    </tr>
  </tbody>
</table>
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->