Some notes on the high level structure of the nanoGPT model

Author: Goran Trlin

The nanoGPT repository is a great open-source project for training and fine-tuning medium-sized GPTs. It has gained a lot of traction in the AI community due to its clear, well-written code, which serves as a great resource for learning about transformers and LLMs.

The nanoGPT repository can be found here: https://github.com/karpathy/nanoGPT

The central part of the repository is the nanoGPT model. It is a transformer-based neural network trained on a text file containing Shakespeare's works. Once trained, the model can generate its own Shakespeare-like verse, one token at a time.
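To make the autoregressive idea concrete, here is a toy sketch of "train on text, then sample one character at a time, feeding each prediction back in". It uses simple bigram character counts instead of a transformer, so it is an illustration of the data flow only, not of nanoGPT's actual model code.

```python
import random
from collections import defaultdict

# Tiny "training corpus" standing in for the Shakespeare text file.
text = "to be, or not to be, that is the question"

# "Training": count how often each character follows each character.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def generate(start, max_new_chars, rng):
    """Autoregressive sampling: each new character is drawn from the
    distribution conditioned on the previous one, then appended and
    fed back in - the same loop structure a GPT uses with tokens."""
    out = [start]
    for _ in range(max_new_chars):
        followers = counts[out[-1]]
        if not followers:  # dead end: no continuation was ever seen
            break
        chars = list(followers)
        weights = list(followers.values())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

print(generate("t", 20, random.Random(0)))
```

nanoGPT replaces the bigram table with a multi-layer transformer that conditions on a whole window of previous tokens, but the generation loop follows the same pattern.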

Below are a few short PDF notes on the structure of the nanoGPT model, covering its key ideas and data flow.

The notes can be viewed in a PDF located here.

I hope these notes help with understanding transformers in general, and the nanoGPT model in particular.