How They Think
An online book about how ChatGPT works
Eric Silberstein
February 12, 2026 draft

Introduction
Is that a spelling error?
A basic model: predicting feathers
Backpropagation
The matrix
Translation and transformers
Tokens
Not feathers…what exactly does the model predict?
Generating text
Deep and nonlinear models: hedgehogs and quills
Cracking open the transformer
Image recognition and ResNet
Transformer block
Attention
Rotary embeddings
The whole model
KV cache
CORE: How will we know if our model works?
Now is it time to train? First: optimizers
Adam
Muon
RMS normalization
Multiple GPUs
Time to train! Base training.
Mid-training
Opposable thumbs: the tool dance
Supervised fine-tuning
Reinforcement learning
Hello model!
Being precise about precision
Select figures
Further reading and resources
Feedback
Acknowledgements
TODO