# chatglm-maths
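Fine-tuning, LoRA, PPO, and inference for chatglm-6b. The training samples are automatically generated arithmetic problems (addition, subtraction, multiplication, and division on integers and decimals). Runs on GPU or CPU.
## Datasets (Chinese)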
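
As a rough illustration of what "automatically generated" samples can look like, here is a minimal sketch that produces `(question, answer)` pairs in the same shape as the `('13+75=', '13+75=88')` tuple printed in the logs of this README. The names `make_sample`, `fmt`, and `OPS` are hypothetical and are not taken from the repository; division is omitted so that every result stays exact, and `Decimal` is used to avoid floating-point noise in decimal samples.

```python
import random
from decimal import Decimal

# Supported operators; division is left out to keep every result exact.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def fmt(value):
    """Render a Decimal without trailing zeros or exponent notation."""
    return format(value.normalize(), "f")

def make_sample(rng, use_decimal=False):
    """Return a (question, answer) pair, e.g. ('13+75=', '13+75=88')."""
    if use_decimal:
        # Two-decimal operands in [0, 99.99]
        a = Decimal(rng.randint(0, 9999)) / 100
        b = Decimal(rng.randint(0, 9999)) / 100
    else:
        a = Decimal(rng.randint(0, 99))
        b = Decimal(rng.randint(0, 99))
    op = rng.choice(sorted(OPS))
    question = f"{fmt(a)}{op}{fmt(b)}="
    return question, question + fmt(OPS[op](a, b))

rng = random.Random(0)  # fixed seed so runs are repeatable
for use_decimal in (False, True):
    print(make_sample(rng, use_decimal))
```

The answer string repeats the question before the result, which matches the pair format shown in the logs; whether the repository builds its samples this way is an assumption.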
- [https://github.com/tatsu-lab/stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca)
- [https://github.com/LianjiaTech/BELLE](https://github.com/LianjiaTech/BELLE)
- [https://github.com/carbonz0/alpaca-chinese-dataset](https://github.com/carbonz0/alpaca-chinese-dataset)
## Pitfalls
```python
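1. eps=1e-5 (do not make it smaller), half-precision float16, and LayerNorm is Post-LN (better generalization) + DeepNorm [well, there is also an LN before Attention]; the purpose is to prevent gradient overflow and similar problems in large models;
2. Model inputs/outputs: the default tokenization_chatglm.py/modeling_chatglm.py cannot be used as-is, because they are written entirely for generation (generate); you have to build all input arguments yourself, or adapt the files yourself;
2.1 In ChatGLMModel, get_masks() works as-is; in get_position_ids(), change 'context_length = seq.index(150004) + 1' to 'context_length = len(seq)';
2.2 Training input_ids format, for now (post-padding for training, pre-padding for inference [tokenization_chatglm.py pre-pads by default]):
x: prompt_1 + "_" + text_1 + "\n" + prompt_2 + [gMASK] + [BOS] + "_" + text_2 + [PAD]*N
2.3 Training label_ids format, for now (CrossEntropyLoss ignores -100 by default, so those positions do not contribute to the loss):
y = [-100]*len(text_1) + [BOS] + text_2 + [EOS] + [-100]*N
2.4 Mind position/mask (the built-in ones only cover inference with batch_size=1, so the training inputs must be written yourself); see the GLM-130B README.md, or the GLM-1 source at https://github.com/THUDM/GLM/blob/main/tasks/seq2seq/dataset.py
3. Note that the chatglm-6b weights are float16, but the loss is computed in float32 and then cast back to float16 for the gradient update;
4. ChatGLMTokenizer occasionally raises odd errors; set max_new_tokens when generating, at most {"max_new_tokens": 2048}; decode sometimes encounters ids that do not exist;
5. LoRA (low-rank adaptation), RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Try: upgrade transformers to the latest version, call .cuda() after get_peft_model, and pass device_map={'':torch.cuda.current_device()},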
```
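
The input and label formats in items 2.2 and 2.3 can be sketched on plain lists of token ids. This is one reading of those formats, not the repository's code: the context (everything before `[gMASK]`) is masked with -100, the labels are aligned so that the `[gMASK]` position predicts `[BOS]` and the last target token predicts `[EOS]`, and padding is appended on the right as in training. The function name `build_example` and all token ids in the usage lines are hypothetical.

```python
IGNORE_INDEX = -100  # label value that CrossEntropyLoss skips by default

def build_example(context_ids, target_ids, gmask_id, bos_id, eos_id, pad_id, max_len):
    """Build post-padded (input_ids, labels) for one training sample."""
    # x = context + [gMASK] + [BOS] + target
    input_ids = context_ids + [gmask_id, bos_id] + target_ids
    # y = [-100]*len(context) + [BOS] + target + [EOS]  (same length as x)
    labels = [IGNORE_INDEX] * len(context_ids) + [bos_id] + target_ids + [eos_id]
    n_pad = max_len - len(input_ids)
    if n_pad < 0:
        raise ValueError("sample is longer than max_len")
    # Post-padding: pad ids on the right, ignored positions in the labels
    input_ids = input_ids + [pad_id] * n_pad
    labels = labels + [IGNORE_INDEX] * n_pad
    return input_ids, labels

# Hypothetical ids, chosen only to make the layout easy to read
input_ids, labels = build_example(
    context_ids=[11, 12, 13], target_ids=[21, 22],
    gmask_id=1, bos_id=2, eos_id=3, pad_id=0, max_len=8,
)
print(input_ids)
print(labels)
```

For inference with pre-padding, the same pieces would be padded on the left instead; position ids and attention masks still have to be built separately, as item 2.4 notes.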
## Environment setup
```shell
transformers>=4.26.1
cpm_kernels==1.0.11
icetk==0.0.4
torch>=1.10.1
rouge==1.0.1
nltk==3.6.6
peft>=0.2.0
numpy
tqdm
lion_pytorch
macropodus
trl>=0.4.1
```
## Fine-tuning on arithmetic problems
```shell
lora
Fine-tune: python c00_toy_lora_train_6b.py
Inference: python p00_toy_lora_predict_6b.py
ppo
Train: python t10_toy_trl_train_ppo.py
Test: python p10_toy_trl_predict_ppo.py
6b
Fine-tune: python c00_toy_cpu_train_6b.py
Inference: python p00_toy_cpu_predit_6b.py
small-layer
Fine-tune: python c01_toy_cpu_train_small.py
Inference: python p01_toy_cpu_predict_small.py
```
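
The fine-tuning log in this README prints each parameter name followed by True or False, with only `transformer.layers.27.*` and `transformer.final_layernorm.*` marked True. Assuming that flag is the parameter's `requires_grad`, partial fine-tuning of this kind can be sketched as below. `set_trainable` and `TinyModel` are hypothetical names; `TinyModel` is only a stand-in that exposes `named_parameters()` the way `torch.nn.Module` does, so the sketch runs without loading the 6B weights.

```python
from types import SimpleNamespace

def set_trainable(model, trainable_prefixes):
    """Enable gradients only for parameters whose name starts with a given prefix."""
    prefixes = tuple(trainable_prefixes)
    flags = {}
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefixes)
        flags[name] = param.requires_grad
    return flags

class TinyModel:
    """Stand-in with a named_parameters() method (illustration only)."""
    def __init__(self, names):
        self._params = {n: SimpleNamespace(requires_grad=True) for n in names}

    def named_parameters(self):
        return self._params.items()

model = TinyModel([
    "transformer.word_embeddings.weight",
    "transformer.layers.26.mlp.dense_4h_to_h.bias",
    "transformer.layers.27.input_layernorm.weight",
    "transformer.final_layernorm.weight",
])
flags = set_trainable(model, ["transformer.layers.27.", "transformer.final_layernorm."])
for name, flag in flags.items():
    print(name, flag)
```

The same function should work on a real PyTorch model, since it only relies on `named_parameters()` and the `requires_grad` attribute.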
## References / Acknowledgements
- [https://github.com/THUDM/ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B)
- [https://github.com/THUDM/GLM](https://github.com/THUDM/GLM)
- [https://github.com/tatsu-lab/stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca)
- [https://github.com/LianjiaTech/BELLE](https://github.com/LianjiaTech/BELLE)
- [https://github.com/huggingface/peft](https://github.com/huggingface/peft)
- [https://github.com/mymusise/ChatGLM-Tuning](https://github.com/mymusise/ChatGLM-Tuning)
- [https://github.com/bojone/bert4keras](https://github.com/bojone/bert4keras)
- [trl](https://github.com/lvwerra/trl)
- [math23k](https://aclanthology.org/D17-1088)
## Inference log (toy)
```cpu
generator_calculate_line: ('13+75=', '13+75=88')
tokenizer.vocab_size: 150344
eval: 0%| | 0/1 [00:00<?, ?it/s]batch_query: ['简便运算: 98+83= 剖析: 98+83=181']
batch_qtext_0: 简便运算: 98+83= 剖析:
batch_qans_0: 98+83=181
response_0: 98+83=171
{'rouge-1': 0.0, 'rouge-2': 0.0, 'rouge-l': 0.0, 'bleu': 0.0}
请输入:
25.31+86.35=
请稍等...
25.31+86.35=101.66
```
## Fine-tuning log (toy)
```cpu
generator_calculate_line: ('13+75=', '13+75=88')
tokenizer.vocab_size: 150344
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:10<00:00, 1.31s/it]
transformer.word_embeddings.weight False
......
transformer.layers.26.mlp.dense_4h_to_h.bias False
transformer.layers.27.input_layernorm.weight True
transformer.layers.27.input_layernorm.bias True
transformer.layers.27.attention.query_key_value.weight True
transformer.layers.27.attention.query_key_value.bias True
transformer.layers.27.attention.dense.weight True
transformer.layers.27.attention.dense.bias True
transformer.layers.27.post_attention_layernorm.weight True
transformer.layers.27.post_attention_layernorm.bias True
transformer.layers.27.mlp.dense_h_to_4h.weight True
transformer.layers.27.mlp.dense_h_to_4h.bias True
transformer.layers.27.mlp.dense_4h_to_h.weight True
transformer.layers.27.mlp.dense_4h_to_h.bias True
transformer.final_layernorm.weight True
transformer.final_layernorm.bias True
model.chat start
13+75=88, but that's not the correct answer. The correct answer is 13+75=88, which is 90.
/anaconda3/envs/py371/lib/python3.7/site-packages/transformers/optimization.py:395: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
FutureWarning,
epoch: 0%| | 0/21 [00:00<?, ?it/s]epochs:
batch_query: ['简便运算: 98+83= 剖析: 98+83=181'] | 0/8 [00:00<?, ?it/s]
epoch_global: 0, step_global: 1, step: 0, loss: 4.0625
batch_query: ['口算: 57.84+13.64 解: 57.84+13.64=71.48']
epoch_global: 0, step_global: 2, step: 1, loss: 2.5625███▌ | 2/8 [00:17<00:51, 8.54s/it]
batch_query: ['计算题: 48+1 解答: 48+1=49']
epoch_global: 0, step_global: 3, step: 2, loss: 4.15625█████████████████████▎ | 3/8 [00:38<01:09, 13.94s/it]
batch_query: ['计算题: 61.65+33.05 解答: 61.65+33.05=94.7']
epoch_global: 0, step_global: 4, step: 3, loss: 2.40625████████████████████████████████████████ | 4/8 [01:01<01:09, 17.43s/it]
batch_query: ['计算: 81+75 回答: 81+
```