# Build a Large Language Model (From Scratch)
This repository contains the code for developing, pretraining, and finetuning a GPT-like LLM and is the official code repository for the book [Build a Large Language Model (From Scratch)](http://mng.bz/orYv).
(If you downloaded the code bundle from the Manning website, please consider visiting the official code repository on GitHub at [https://github.com/rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch).)
<br>
<br>
<a href="http://mng.bz/orYv"><img src="images/cover.jpg" width="250px"></a>
In [*Build a Large Language Model (from Scratch)*](http://mng.bz/orYv), you'll discover how LLMs work from the inside out. In this book, I'll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.
The method described in this book for training and developing your own small-but-functional LLM for educational purposes mirrors the approach used to create large-scale foundation models such as those behind ChatGPT.
- Link to the official [source code repository](https://github.com/rasbt/LLMs-from-scratch)
- [Link to the early access version](http://mng.bz/orYv) at Manning
- ISBN 9781633437166
- Publication in Early 2025 (estimated)
<br>
<br>
# Table of Contents
Please note that this `README.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, [MarkText](https://www.marktext.cc) is a good free option.
Alternatively, you can view this and other files on GitHub at [https://github.com/rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch).
<br>
<br>
| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
|------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
| Ch 1: Understanding Large Language Models | No code | No code |
| Ch 2: Working with Text Data | - [ch02.ipynb](ch02/01_main-chapter-code/ch02.ipynb)<br/>- [dataloader.ipynb](ch02/01_main-chapter-code/dataloader.ipynb) (summary)<br/>- [exercise-solutions.ipynb](ch02/01_main-chapter-code/exercise-solutions.ipynb) | [./ch02](./ch02) |
| Ch 3: Coding Attention Mechanisms | - [ch03.ipynb](ch03/01_main-chapter-code/ch03.ipynb)<br/>- [multihead-attention.ipynb](ch03/01_main-chapter-code/multihead-attention.ipynb) (summary) <br/>- [exercise-solutions.ipynb](ch03/01_main-chapter-code/exercise-solutions.ipynb)| [./ch03](./ch03) |
| Ch 4: Implementing a GPT Model from Scratch | - [ch04.ipynb](ch04/01_main-chapter-code/ch04.ipynb)<br/>- [gpt.py](ch04/01_main-chapter-code/gpt.py) (summary)<br/>- [exercise-solutions.ipynb](ch04/01_main-chapter-code/exercise-solutions.ipynb) | [./ch04](./ch04) |
| Ch 5: Pretraining on Unlabeled Data | Q1 2024 | ... |
| Ch 6: Finetuning for Text Classification | Q2 2024 | ... |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | ... |
| Ch 8: Using Large Language Models in Practice | Q2/3 2024 | ... |
| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/03_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/03_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/03_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/03_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
| Appendix B: References and Further Reading     | No code                                                                                                                           | No code                       |
| Appendix C: Exercises                          | No code                                                                                                                           | No code                       |
<br>
> [!TIP]
> Please see [this](appendix-A/01_optional-python-setup-preferences) and [this](appendix-A/02_installing-python-libraries) folder if you need more guidance on installing Python and Python packages.
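> [!NOTE]
> The repository also ships a `python_environment_check.py` script for verifying your setup. The snippet below is a minimal, hypothetical stand-in for such a check (the package list and function name are illustrative, not taken from the repository) that reports whether the core dependencies used throughout the book are installed:

```python
import sys
from importlib.metadata import version, PackageNotFoundError

def check_environment(packages):
    """Return a mapping of package name -> installed version string (or None if missing)."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = version(pkg)
        except PackageNotFoundError:
            versions[pkg] = None
    return versions

# The book's chapters rely mainly on PyTorch and tiktoken.
assert sys.version_info >= (3, 8), "Python 3.8 or newer is recommended"
for name, ver in check_environment(["torch", "tiktoken", "matplotlib"]).items():
    print(f"{name:12s} {ver or 'NOT INSTALLED'}")
```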
<br>
<br>
Shown below is a mental model summarizing the contents covered in this book.
<img src="images/mental-model.jpg" width="600px">