---
license: llama3
library_name: transformers
pipeline_tag: text-generation
base_model: meta-llama/Meta-Llama-3-8B-Instruct
language:
- en
- zh
tags:
- llama-factory
- orpo
---
ð We included all instructions on how to download, use, and reproduce our various kinds of models at [this GitHub repo](https://github.com/Shenzhi-Wang/Llama3-Chinese-Chat). If you like our models, we would greatly appreciate it if you could star our Github repository. Additionally, please click "like" on our HuggingFace repositories. Thank you!
âï¸âï¸âï¸NOTICE: The main branch contains the files for Llama3-8B-Chinese-Chat-**v2.1**. If you want to use our Llama3-8B-Chinese-Chat-**v1**, please refer to [the `v1` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/tree/v1); if you want to use our Llama3-8B-Chinese-Chat-**v2**, please refer to [the `v2` branch](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/tree/v2).
âï¸âï¸âï¸NOTICE: For optimal performance, we refrain from fine-tuning the model's identity. Thus, inquiries such as "Who are you" or "Who developed you" may yield random responses that are not necessarily accurate.
# Updates
- ððð [May 6, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2.1**! Compared to v1, the training dataset of v2.1 is **5x larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! Compared to v2, v2.1 surpasses v2 in **math** and is **less prone to including English words in Chinese responses**. The training dataset of Llama3-8B-Chinese-Chat-v2.1 will be released soon. If you love our Llama3-8B-Chinese-Chat-v1 or v2, you won't want to miss out on Llama3-8B-Chinese-Chat-v2.1!
- ð¥ We provide an online interactive demo for Llama3-8B-Chinese-Chat-v2 [here](https://huggingface.co/spaces/llamafactory/Llama3-8B-Chinese-Chat). Have fun with our latest model!
- ð¥ We provide the official **Ollama model for the q4_0 GGUF** version of Llama3-8B-Chinese-Chat-v2.1 at [wangshenzhi/llama3-8b-chinese-chat-ollama-q4](https://ollama.com/wangshenzhi/llama3-8b-chinese-chat-ollama-q4)! Run the following command for quick use of this model: `ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q4`.
- ð¥ We provide the official **Ollama model for the q8_0 GGUF** version of Llama3-8B-Chinese-Chat-v2.1 at [wangshenzhi/llama3-8b-chinese-chat-ollama-q8](https://ollama.com/wangshenzhi/llama3-8b-chinese-chat-ollama-q8)! Run the following command for quick use of this model: `ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q8`.
- ð¥ We provide the official **Ollama model for the f16 GGUF** version of Llama3-8B-Chinese-Chat-v2.1 at [wangshenzhi/llama3-8b-chinese-chat-ollama-fp16](https://ollama.com/wangshenzhi/llama3-8b-chinese-chat-ollama-fp16)! Run the following command for quick use of this model: `ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-fp16`.
- ð¥ We provide the official **q4_0 GGUF** version of Llama3-8B-Chinese-Chat-**v2.1** at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-4bit!
- ð¥ We provide the official **q8_0 GGUF** version of Llama3-8B-Chinese-Chat-**v2.1** at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit!
- ð¥ We provide the official **f16 GGUF** version of Llama3-8B-Chinese-Chat-**v2.1** at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-f16!
<details>
<summary><b>Updates for Llama3-8B-Chinese-Chat-v2 [CLICK TO EXPAND]</b></summary>
- ð¥ Llama3-8B-Chinese-v2's link: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/tree/v2
- ð¥ We provide the official f16 GGUF version of Llama3-8B-Chinese-Chat-**v2** at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-f16/tree/v2!
- ð¥ We provide the official 8bit-quantized GGUF version of Llama3-8B-Chinese-Chat-**v2** at https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v2!
- ð¥ We provide an online interactive demo for Llama3-8B-Chinese-Chat-v2 (https://huggingface.co/spaces/llamafactory/Llama3-8B-Chinese-Chat). Have fun with our latest model!
- ððð [Apr. 29, 2024] We now introduce Llama3-8B-Chinese-Chat-**v2**! Compared to v1, the training dataset of v2 is **5x larger** (~100K preference pairs), and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities! If you love our Llama3-8B-Chinese-Chat-v1, you won't want to miss out on Llama3-8B-Chinese-Chat-v2!
</details>
<details>
<summary><b>Updates for Llama3-8B-Chinese-Chat-v1 [CLICK TO EXPAND]</b></summary>
- ð¥ Llama3-8B-Chinese-v1's link: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/tree/v1
- ð¥ We provide the official Ollama model for the f16 GGUF version of Llama3-8B-Chinese-Chat-**v1** at [wangshenzhi/llama3-8b-chinese-chat-ollama-f16](https://ollama.com/wangshenzhi/llama3-8b-chinese-chat-ollama-f16)! Run the following command for quick use of this model: `ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-fp16`.
- ð¥ We provide the official Ollama model for the 8bit-quantized GGUF version of Llama3-8B-Chinese-Chat-**v1** at [wangshenzhi/llama3-8b-chinese-chat-ollama-q8](https://ollama.com/wangshenzhi/llama3-8b-chinese-chat-ollama-q8)! Run the following command for quick use of this model: `ollama run wangshenzhi/llama3-8b-chinese-chat-ollama-q8`.
- ð¥ We provide the official f16 GGUF version of Llama3-8B-Chinese-Chat-**v1** at [shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-f16-v1](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-f16/tree/v1)!
- ð¥ We provide the official 8bit-quantized GGUF version of Llama3-8B-Chinese-Chat-**v1** at [shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit-v1](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit/tree/v1)!
- ð If you are in China, you can download our **v1** model from our [Gitee AI repository](https://ai.gitee.com/hf-models/shenzhi-wang/Llama3-8B-Chinese-Chat).
</details>
<br />
# Model Summary
Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese & English users with various abilities such as roleplaying & tool-using built upon the Meta-Llama-3-8B-Instruct model.
Developed by: [Shenzhi Wang](https://shenzhi-wang.netlify.app) (çæ
æ§) and [Yaowei Zheng](https://github.com/hiyouga) (éèå¨)
- License: [Llama-3 License](https://llama.meta.com/llama3/license/)
- Base Model: Meta-Llama-3-8B-Instruct
- Model Size: 8.03B
- Context length: 8K
# 1. Introduction
This is the first model specifically fine-tuned for Chinese & English user through ORPO [1] based on the [Meta-Llama-3-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
**Compared to the original [Meta-Llama-3-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), our Llama3-8B-Chinese-Chat-v1 model significantly reduces the issues of "Chinese questions with English answers" and the mixing of Chinese and English in responses.**
**Compared to [Llama3-8B-Chinese-Chat-v1](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/tree/v1), our Llama3-8B-Chinese-Chat-v2 model significantly increases the training data size (from 20K to 100K), which introduces great performance enhancement, especially in roleplay, tool using, and math.**
[1] Hong, Jiwoo, Noah Lee, and James Thorne. "Reference-free Monolithic Preference Optimization with Odds Ratio." arXiv preprint arXiv:2403.07691 (2024).
Training framework: [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
Training details:
- epochs: 2
- learning rate: 3e-6
- learning rate scheduler type: cosine
- Warmup ratio: 0.1
- cutoff len (i.e. context length): 8192
- orpo beta (i.e. $\lambda$ in the ORPO paper): 0.05
- global batch size: 128
- fine-tuning type: full parameters
- optimizer: paged_adamw_32bit
<details>
<summary><b>To reproduce the model [CLICK TO EXPAND]</b></summary>
To reproduce Llama3-8B-Chinese-Cha
没有合适的资源?快使用搜索试试~ 我知道了~
LLama3 中文大模型进行指令微调的中文聊天语言模型
共13个文件
json:6个
safetensors:4个
gitattributes:1个
1.该资源内容由用户上传,如若侵权请联系客服进行举报
2.虚拟产品一经售出概不退款(资源遇到问题,请及时私信上传者)
2.虚拟产品一经售出概不退款(资源遇到问题,请及时私信上传者)
版权申诉
0 下载量 45 浏览量
2024-06-06
20:28:48
上传
评论
收藏 2.25MB ZIP 举报
温馨提示
Llama3-8B-Chinese-Chat 是一款基于 Meta-Llama-3-8B-Instruct 模型进行指令微调的中文聊天语言模型。该模型针对中文和英文用户进行了专门的优化,具有角色扮演、工具使用、数学计算等多种功能。最新的 v2.1 版本相较于之前的版本,训练数据量更大,性能更强大,尤其在角色扮演、工具调用和数学能力方面表现突出。该模型基于 Meta-Llama-3-8B-Instruct 模型,专门针对中文和英文用户进行优化,能够减少中文问题出现英文回答的现象。v2.1 版本相比 v1 版本,训练数据量增加了 5 倍,显著提高了模型在角色扮演、工具调用和数学计算方面的能力。该模型还提供了 Ollama 模型和 GGUF 版本,方便用户使用。
资源推荐
资源详情
资源评论
收起资源包目录
Llama3-8B-Chinese-Chat-main.zip (13个子文件)
Llama3-8B-Chinese-Chat-main
model-00003-of-00004.safetensors 135B
.gitattributes 1KB
model-00002-of-00004.safetensors 135B
LICENSE 8KB
tokenizer.json 8.66MB
generation_config.json 147B
model.safetensors.index.json 23KB
model-00001-of-00004.safetensors 135B
config.json 649B
tokenizer_config.json 50KB
special_tokens_map.json 97B
README.md 41KB
model-00004-of-00004.safetensors 135B
共 13 条
- 1
资源评论
传奇开心果编程
- 粉丝: 1w+
- 资源: 454
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
最新资源
- (源码)基于Java和MySQL的学生信息管理系统.zip
- (源码)基于ASP.NET Core的零售供应链管理系统.zip
- (源码)基于PythonSpleeter的戏曲音频处理系统.zip
- (源码)基于Spring Boot的监控与日志管理系统.zip
- (源码)基于C++的Unix V6++二级文件系统.zip
- (源码)基于Spring Boot和JPA的皮皮虾图片收集系统.zip
- (源码)基于Arduino和Python的实时歌曲信息液晶显示屏展示系统.zip
- (源码)基于C++和C混合模式的操作系统开发项目.zip
- (源码)基于Arduino的全球天气监控系统.zip
- OpenCVForUnity2.6.0.unitypackage
资源上传下载、课程学习等过程中有任何疑问或建议,欢迎提出宝贵意见哦~我们会及时处理!
点击此处反馈
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功