# 💯AI00 RWKV Server
<p align='center'>
<img src="docs/public/logo.gif" />
</p>
<div align="center">
![license](https://shields.io/badge/license-MIT%2FApache--2.0-blue)
[![Rust Version](https://img.shields.io/badge/Rust-1.75.0+-blue)](https://releases.rs/docs/1.75.0)
![PRs welcome](https://img.shields.io/badge/PRs-Welcome-brightgreen)
<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
[![All Contributors](https://img.shields.io/badge/all_contributors-7-orange.svg?style=flat-square)](#contributors-)
<!-- ALL-CONTRIBUTORS-BADGE:END -->
[English](README.md) | [中文](README_zh.md)
---
<div align="left">
`AI00 RWKV Server` is an inference API server for the [`RWKV` language model](https://github.com/BlinkDL/ChatRWKV) based upon the [`web-rwkv`](https://github.com/cryscan/web-rwkv) inference engine.
It supports `VULKAN` parallel and concurrent batched inference and can run on all GPUs that support `VULKAN`. No need for Nvidia cards!!! AMD cards and even integrated graphics can be accelerated!!!
No need for bulky `pytorch`, `CUDA` and other runtime environments, it's compact and ready to use out of the box!
Compatible with OpenAI's ChatGPT API interface.
100% open source and commercially usable, under the MIT license.
If you are looking for a fast, efficient, and easy-to-use LLM API server, then `AI00 RWKV Server` is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.
Join the `AI00 RWKV Server` community now and experience the charm of AI!
QQ Group for communication: 30920262
### 💥Features
* Based on the `RWKV` model, it has high performance and accuracy
* Supports `VULKAN` inference acceleration, you can enjoy GPU acceleration without the need for `CUDA`! Supports AMD cards, integrated graphics, and all GPUs that support `VULKAN`
* No need for bulky `pytorch`, `CUDA` and other runtime environments, it's compact and ready to use out of the box!
* Compatible with OpenAI's ChatGPT API interface
### ⭐Usages
* Chatbots
* Text generation
* Translation
* Q&A
* Any other tasks that LLM can do
### 👻Other
* Based on the [web-rwkv](https://github.com/cryscan/web-rwkv) project
* Model download: [V5](https://huggingface.co/cgisky/AI00_RWKV_V5) or [V6](https://huggingface.co/cgisky/ai00_rwkv_x060)
## Installation, Compilation, and Usage
### 📦Download Pre-built Executables
1. Download the latest version directly from the [Releases](https://github.com/cgisky1980/ai00_rwkv_server/releases) page
2. After [downloading the model](#👻other), place the model in the `assets/models/` path, for example, `assets/models/RWKV-x060-World-3B-v2-20240228-ctx4096.st`
3. Optionally modify [`assets/Config.toml`](./assets/Config.toml) for model configurations like model path, quantization layers, etc.
4. Run in the command line
```bash
$ ./ai00_rwkv_server
```
5. Open the browser and visit the WebUI [`https://localhost:65530`](https://localhost:65530)
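Step 3 above points at `assets/Config.toml`. As an illustrative sketch only, a minimal configuration might look like the fragment below; the key names (`path`, `quant`) are assumptions for illustration, so consult the `Config.toml` shipped in `assets/` for the authoritative schema:

```toml
# Hypothetical sketch of a model section -- verify key names against
# the Config.toml that ships with your release.
[model]
path = "assets/models/RWKV-x060-World-3B-v2-20240228-ctx4096.st"  # model file to load
quant = 0                                                          # number of quantized layers
```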
### 🔨(Optional) Build from Source
1. [Install Rust](https://www.rust-lang.org/)
2. Clone this repository
```bash
$ git clone https://github.com/cgisky1980/ai00_rwkv_server.git
$ cd ai00_rwkv_server
```
3. After [downloading the model](#👻other), place the model in the `assets/models/` path, for example, `assets/models/RWKV-x060-World-3B-v2-20240228-ctx4096.st`
4. Compile
```bash
$ cargo build --release
```
5. After compilation, run
```bash
$ cargo run --release
```
6. Open the browser and visit the WebUI [`https://localhost:65530`](https://localhost:65530)
### 📝Convert the Model
The server currently only supports Safetensors models with the `.st` extension. Models saved in PyTorch's `.pth` format need to be converted before use.
1. [Download the `.pth` model](https://huggingface.co/BlinkDL)
2. In the [Releases](https://github.com/cgisky1980/ai00_rwkv_server/releases) you can find an executable called `converter`. Run
```bash
$ ./converter --input /path/to/model.pth
```
3. If you are building from source, run
```bash
$ cargo run --release --bin converter -- --input /path/to/model.pth
```
4. Just like the steps mentioned above, place the converted `.st` model in the `assets/models/` path and modify the model path in [`assets/Config.toml`](./assets/Config.toml)
## 📝Supported Arguments
* `--config`: Configuration file path (default: `assets/Config.toml`)
* `--ip`: The IP address the server is bound to
* `--port`: Running port
## 📙Currently Available APIs
The API service starts at port 65530, and the data input and output format follow the OpenAI API specification.
* `/api/oai/v1/models`
* `/api/oai/models`
* `/api/oai/v1/chat/completions`
* `/api/oai/chat/completions`
* `/api/oai/v1/completions`
* `/api/oai/completions`
* `/api/oai/v1/embeddings`
* `/api/oai/embeddings`
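Since the endpoints follow OpenAI conventions, a request can be built with any HTTP client. The sketch below uses only the Python standard library; the payload fields follow OpenAI's chat completion format, and the base URL comes from the defaults above (note the server uses a self-signed HTTPS certificate, so actually sending the request requires a certificate-check override):

```python
import json
from urllib import request

# Base URL from the defaults documented above.
BASE_URL = "https://localhost:65530/api/oai"

def build_chat_request(messages, max_tokens=256):
    """Build an OpenAI-style chat completion request for the local server."""
    body = json.dumps({
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": False,  # set True for SSE streaming
    }).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "Hello!"}])
print(req.full_url)
# Sending it would be: json.load(request.urlopen(req))
# (requires the server to be running, plus an ssl context that
#  accepts the self-signed certificate)
```

The same pattern applies to `/v1/completions` and `/v1/embeddings`; only the path and payload fields change.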
## 📙WebUI Screenshots
### Chat Feature
<img src="img/chat_en.gif" />
### Continuation Feature
<img src="img/continuation_en.gif" />
### Paper Writing Feature
<img src="img/paper_en.gif" />
## 📝TODO List
* [x] Support for `text_completions` and `chat_completions`
* [x] Support for SSE push
* [x] Add `embeddings`
* [x] Integrate basic front-end
* [x] Parallel inference via `batch serve`
* [x] Support for `int8` quantization
* [x] Support for `NF4` quantization
* [x] Support for `LoRA` model
* [ ] Hot loading and switching of `LoRA` model
## 👥Join Us
We are always looking for people interested in helping us improve the project. If you are interested in any of the following, please join us!
* 📝Writing code
* 💬Providing feedback
* 🤔Proposing ideas or needs
* 🧪Testing new features
* ✏️Translating documentation
* 📣Promoting the project
* 🎉Anything else that would be helpful to us
No matter your skill level, we welcome you to join us. You can join us in the following ways:
* Join our Discord channel
* Join our QQ group
* Submit issues or pull requests on GitHub
* Leave feedback on our website
We can't wait to work with you to make this project better! We hope the project is helpful to you!
## Thanks to these awesome, insightful individuals for their support and selfless dedication to the project
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
<tbody>
<tr>
      <td align="center" valign="top" width="14.28%"><a href="https://github.com/cgisky1980"><img src="https://avatars.githubusercontent.com/u/82481660?v=4?s=100" width="100px;" alt="顾真牛"/><br /><sub><b>顾真牛</b></sub></a><br /><a href="https://github.com/Ai00-X/ai00_server/commits?author=cgisky1980" title="Documentation">📖</a> <a href="https://github.com/Ai00-X/ai00_server/commits?author=cgisky1980" title="Code">💻</a> <a href="#content-cgisky1980" title="Content">🖋</a> <a href="#design-cgisky1980" title="Design">🎨</a> <a href="#mentoring-cgisky1980" title="Mentoring">🧑‍🏫</a></td>
      <td align="center" valign="top" width="14.28%"><a href="https://cryscan.github.io/profile"><img src="https://avatars.githubusercontent.com/u/16053640?v=4?s=100" width="100px;" alt="研究社交"/><br /><sub><b>研究社交</b></sub></a><br /><a href="https://github.com/Ai00-X/ai00_server/commits?author=cryscan" title="Code">💻</a> <a href="#example-cryscan" title="Examples">💡</a> <a href="#ideas-cryscan" title="Ideas, Planning, & Feedback">🤔</a> <a href="#maintenance-cryscan" title="Maintenance">🚧</a> <a href="https://github.com/Ai00-X/ai00_server/pulls?q=is%3Apr+reviewed-by%3Acryscan" title="Reviewed Pull Requests">👀</a> <a href="#platform-cryscan" title="Packaging/porting to new platform">📦</a></td>
    </tr>
  </tbody>
</table>
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->