# Build a Large Language Model (From Scratch)
This repository contains the code for developing, pretraining, and finetuning a GPT-like LLM and is the official code repository for the book [Build a Large Language Model (From Scratch)](http://mng.bz/orYv).
(If you downloaded the code bundle from the Manning website, please consider visiting the official code repository on GitHub at [https://github.com/rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch).)
<br>
<br>
<a href="http://mng.bz/orYv"><img src="images/cover.jpg" width="250px"></a>
In [*Build a Large Language Model (from Scratch)*](http://mng.bz/orYv), you'll discover how LLMs work from the inside out. In this book, I'll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.
The method described in this book for training and developing your own small-but-functional LLM for educational purposes mirrors the approach used to create large-scale foundation models such as those behind ChatGPT.
- Link to the official [source code repository](https://github.com/rasbt/LLMs-from-scratch)
- [Link to the early access version](http://mng.bz/orYv) at Manning
- ISBN 9781633437166
- Publication in Early 2025 (estimated)
<br>
<br>
# Table of Contents
Please note that this `README.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, [MarkText](https://www.marktext.cc) is a good free option.
Alternatively, you can view this and other files on GitHub at [https://github.com/rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch).
<br>
<br>
| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
|------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
| Ch 1: Understanding Large Language Models | No code | No code |
| Ch 2: Working with Text Data | - [ch02.ipynb](ch02/01_main-chapter-code/ch02.ipynb)<br/>- [dataloader.ipynb](ch02/01_main-chapter-code/dataloader.ipynb) (summary)<br/>- [exercise-solutions.ipynb](ch02/01_main-chapter-code/exercise-solutions.ipynb) | [./ch02](./ch02) |
| Ch 3: Coding Attention Mechanisms | - [ch03.ipynb](ch03/01_main-chapter-code/ch03.ipynb)<br/>- [multihead-attention.ipynb](ch03/01_main-chapter-code/multihead-attention.ipynb) (summary) <br/>- [exercise-solutions.ipynb](ch03/01_main-chapter-code/exercise-solutions.ipynb)| [./ch03](./ch03) |
| Ch 4: Implementing a GPT Model from Scratch | - [ch04.ipynb](ch04/01_main-chapter-code/ch04.ipynb)<br/>- [gpt.py](ch04/01_main-chapter-code/gpt.py) (summary)<br/>- [exercise-solutions.ipynb](ch04/01_main-chapter-code/exercise-solutions.ipynb) | [./ch04](./ch04) |
| Ch 5: Pretraining on Unlabeled Data | Q1 2024 | ... |
| Ch 6: Finetuning for Text Classification | Q2 2024 | ... |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | ... |
| Ch 8: Using Large Language Models in Practice | Q2/3 2024 | ... |
| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/03_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/03_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/03_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/03_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
| Appendix B: References and Further Reading     | No code                                                                                                                           | No code                       |
| Appendix C: Exercises                          | No code                                                                                                                           | No code                       |
<br>
> [!TIP]
> Please see [this](appendix-A/01_optional-python-setup-preferences) and [this](appendix-A/02_installing-python-libraries) folder if you need more guidance on installing Python and Python packages.
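> [!NOTE]
> The repository also ships a `python_environment_check.py` script for verifying your setup. The snippet below is a minimal, hypothetical stand-in for such a check (the package list and function name are illustrative, not taken from the repository) that reports whether the core dependencies used throughout the book are installed:

```python
import sys
from importlib.metadata import version, PackageNotFoundError

def check_environment(packages):
    """Return a mapping of package name -> installed version string (or None if missing)."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = version(pkg)
        except PackageNotFoundError:
            versions[pkg] = None
    return versions

# The book's chapters rely mainly on PyTorch and tiktoken.
assert sys.version_info >= (3, 8), "Python 3.8 or newer is recommended"
for name, ver in check_environment(["torch", "tiktoken", "matplotlib"]).items():
    print(f"{name:12s} {ver or 'NOT INSTALLED'}")
```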
<br>
<br>
Shown below is a mental model summarizing the contents covered in this book.
<img src="images/mental-model.jpg" width="600px">