The RWKV Language Model
https://github.com/BlinkDL/ChatRWKV
https://github.com/BlinkDL/RWKV-LM
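The model class and sampling pipeline used below come from the `rwkv` package on PyPI (see the ChatRWKV repo for details); a typical install:

```shell
# install the rwkv pip package, which provides rwkv.model.RWKV and rwkv.utils.PIPELINE
pip install rwkv
```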
```python
import os

# set these before importing rwkv
os.environ['RWKV_JIT_ON'] = '1'
os.environ["RWKV_CUDA_ON"] = '0' # '1' to compile the CUDA kernel (10x faster), requires a C++ compiler & CUDA libraries
########################################################################################################
#
# Use '/' in model path, instead of '\'. Use ctx4096 models if you need long ctx.
#
# fp16 = good for GPU (!!! DOES NOT support CPU !!!)
# fp32 = good for CPU
# bf16 = worse accuracy, supports CPU
# xxxi8 (example: fp16i8, fp32i8) = xxx with int8 quantization to save 50% VRAM/RAM, slower, slightly less accuracy
#
# We consider [ln_out+head] to be an extra layer, so L12-D768 (169M) has "13" layers, L24-D2048 (1.5B) has "25" layers, etc.
# Strategy Examples: (device = cpu/cuda/cuda:0/cuda:1/...)
# 'cpu fp32' = all layers cpu fp32
# 'cuda fp16' = all layers cuda fp16
# 'cuda fp16i8' = all layers cuda fp16 with int8 quantization
# 'cuda fp16i8 *10 -> cpu fp32' = first 10 layers cuda fp16i8, then cpu fp32 (increase 10 for better speed)
# 'cuda:0 fp16 *10 -> cuda:1 fp16 *8 -> cpu fp32' = first 10 layers cuda:0 fp16, then 8 layers cuda:1 fp16, then cpu fp32
#
# Basic Strategy Guide: (fp16i8 works for any GPU)
# 100% VRAM = 'cuda fp16' # all layers cuda fp16
# 98% VRAM = 'cuda fp16i8 *1 -> cuda fp16' # first 1 layer cuda fp16i8, then cuda fp16
# 96% VRAM = 'cuda fp16i8 *2 -> cuda fp16' # first 2 layers cuda fp16i8, then cuda fp16
# 94% VRAM = 'cuda fp16i8 *3 -> cuda fp16' # first 3 layers cuda fp16i8, then cuda fp16
# ...
# 50% VRAM = 'cuda fp16i8' # all layers cuda fp16i8
# 48% VRAM = 'cuda fp16i8 -> cpu fp32 *1' # most layers cuda fp16i8, last 1 layer cpu fp32
# 46% VRAM = 'cuda fp16i8 -> cpu fp32 *2' # most layers cuda fp16i8, last 2 layers cpu fp32
# 44% VRAM = 'cuda fp16i8 -> cpu fp32 *3' # most layers cuda fp16i8, last 3 layers cpu fp32
# ...
# 0% VRAM = 'cpu fp32' # all layers cpu fp32
#
# Use '+' for STREAM mode, which can also save VRAM and is sometimes faster
# 'cuda fp16i8 *10+' = first 10 layers cuda fp16i8, then fp16i8 stream the rest to it (increase 10 for better speed)
#
# Extreme STREAM: 3GB of VRAM is enough to run RWKV 14B (slow; will be faster in the future)
# 'cuda fp16i8 *0+ -> cpu fp32 *1' = stream all layers cuda fp16i8, last 1 layer [ln_out+head] cpu fp32
#
########################################################################################################
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS
# download models: https://huggingface.co/BlinkDL
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-169m/RWKV-4-Pile-169M-20220807-8023', strategy='cpu fp32')
pipeline = PIPELINE(model, "20B_tokenizer.json") # 20B_tokenizer.json is in https://github.com/BlinkDL/ChatRWKV
ctx = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."
print(ctx, end='')
def my_print(s):
    print(s, end='', flush=True)
# For alpha_frequency and alpha_presence, see "Frequency and presence penalties":
# https://platform.openai.com/docs/api-reference/parameter-details
args = PIPELINE_ARGS(temperature=1.0, top_p=0.7, top_k=100, # top_k = 0 disables top-k filtering
                     alpha_frequency=0.25,
                     alpha_presence=0.25,
                     alpha_decay=0.996, # gradually decay the penalty
                     token_ban=[0], # ban the generation of some tokens
                     token_stop=[], # stop generation whenever any of these tokens appears
                     chunk_len=256) # split input into chunks to save VRAM (shorter -> slower)
pipeline.generate(ctx, token_count=200, args=args, callback=my_print)
print('\n')
out, state = model.forward([187, 510, 1563, 310, 247], None)
print(out.detach().cpu().numpy()) # get logits
out, state = model.forward([187, 510], None)
out, state = model.forward([1563], state) # RNN has state (use deepcopy to clone states)
out, state = model.forward([310, 247], state)
print(out.detach().cpu().numpy()) # same result as above
print('\n')
```
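The strategy strings described in the comments above ('device dtype', an optional '*N' layer count, '+' for STREAM mode, and '->' to chain stages) can be illustrated with a small parser sketch. This is a hypothetical helper written for explanation only, not the library's actual implementation:

```python
# Hypothetical sketch of how a strategy string could be interpreted.
# NOT the library's parser -- just an illustration of the syntax.

def parse_strategy(strategy):
    """Split e.g. 'cuda fp16i8 *10 -> cpu fp32' into (device, dtype, n_layers, stream) stages.
    n_layers is None for 'all remaining layers'; a trailing '+' marks STREAM mode."""
    stages = []
    for part in strategy.split('->'):
        tokens = part.split()
        device, dtype = tokens[0], tokens[1]
        n_layers, stream = None, False
        if len(tokens) > 2 and tokens[2].startswith('*'):
            count = tokens[2][1:]
            if count.endswith('+'):   # '+' = stream the remaining layers
                stream = True
                count = count[:-1]
            n_layers = int(count)
        stages.append((device, dtype, n_layers, stream))
    return stages

print(parse_strategy('cuda fp16i8 *10 -> cpu fp32'))
# -> [('cuda', 'fp16i8', 10, False), ('cpu', 'fp32', None, False)]
```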
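For `alpha_frequency` / `alpha_presence`, the linked OpenAI docs describe the usual formulation: a token's logit is reduced by `alpha_presence` once it has appeared at all, plus `alpha_frequency` times its count, while `alpha_decay` shrinks the counts over time. A minimal sketch assuming that formulation (illustrative only, not the library's exact sampling code):

```python
# Illustrative frequency/presence penalty, assuming the OpenAI-style rule;
# parameter names mirror PIPELINE_ARGS, but this is not the library's code.

def penalize(logits, counts, alpha_frequency=0.25, alpha_presence=0.25):
    """Lower the logit of every token that has already been generated."""
    out = dict(logits)
    for token, n in counts.items():
        out[token] -= alpha_presence + alpha_frequency * n
    return out

def decay(counts, alpha_decay=0.996):
    """Applied once per generated token, so old repetitions matter less."""
    return {t: n * alpha_decay for t, n in counts.items()}

logits = {3: 1.0, 7: 2.0, 9: 0.5}   # dummy logits for three tokens
counts = {3: 1, 7: 2}               # token 7 was generated twice so far
print(penalize(logits, counts))     # token 7 gets the largest penalty
```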
[Resource description] A plant/flower dataset [PlantFlower Datasets] built on the RWKV World model. Notes: 1. This is a personal graded project; it passed advisor review and scored 95 in the defense. 2. The project code in this resource was tested and uploaded only after it ran successfully. 3. It is intended for students, teachers, or engineers in computer-related majors (software engineering, computer science, AI, communications engineering, automation, electronic information, etc.) and can be used for graduation projects, course projects, assignments, or early project demos; it also suits beginners who want to learn. 4. With some background knowledge, you can modify the code to add features, or use it directly for graduation/course projects and assignments.
Package contents — plant/flower dataset [PlantFlower Datasets] based on the RWKV World model: source code + detailed documentation + full data (26 files)

PlantFlowerDatasets-main/
    data/
        PlantFlower_text_document.bin (8.3MB)
        plantflower_cnflora_data_text_document.bin (70.81MB)
        plantflower_cnflora_data_text_document.idx (704KB)
        PlantFlower_text_document.idx (100KB)
        cnflora_data_text_document.bin (62.21MB)
        cnflora_data_text_document.idx (604KB)
    LICENSE (1KB)
    pic20230621195113.png (113KB)
    pic20230706105642.png (193KB)
    RWKV-LM-LoRA [World model] training files/
        RWKV-v4neo/
            src/
                rwkv_vocab_v20230424.txt (1.04MB)
                utils_word.py (5KB)
                rwkv_tokenizer.py (3KB)
                chat_word.py (19KB)
        rwkv_pip_package/
            src/
                rwkv/
                    utils.py (5KB)
                    __init__.py (0B)
                    rwkv_vocab_v20230424.txt (1.04MB)
                    model.py (32KB)
                    cuda/
                        wrapper.cpp (5KB)
                        operators.cu (8KB)
                    rwkv_tokenizer.py (3KB)
            LICENSE (11KB)
            pyproject.toml (499B)
            MANIFEST.in (47B)
            README.md (4KB)
        README.md (4KB)
    171265889347208773632.zip (416B)