I trained a ChatGPT clone for £75
I just watched a machine learn to speak. Over the course of 4 hours, for £75 ($100), I trained my own conversational AI from scratch. Not by using an API or fine-tuning someone else’s model - I mean actually training a language model, watching it go from random noise to something that can hold a conversation.
It’s a strange experience. You’re essentially watching intelligence emerge from mathematics.
The setup
I used nanochat as my foundation - Andrej Karpathy’s “minimal, hackable” implementation of the full LLM training pipeline. The beauty of nanochat is that it’s simple enough to understand but complete enough to actually work. No mysterious or bloated frameworks - just the core process laid bare.
For hardware, I spun up an 8xH100 GPU node on Lambda Labs at $24/hour. Eight top-of-the-line GPUs sitting in a data center somewhere, mine for the next few hours. Four hours at $24/hour comes to roughly $96, which is where the $100 figure comes from. It's the same hardware that tech companies use to train their production models, just rented by the hour (I didn't have $200k sitting around to buy a node outright).
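nanochat drives the whole run from a single script, but under the hood the eight GPUs cooperate through PyTorch's distributed training machinery. Here's a minimal sketch of what that setup looks like, assuming the usual torchrun workflow - the script name and flags are illustrative, not nanochat's actual entry point:

```python
# Launched once per GPU by torchrun, e.g.:
#   torchrun --standalone --nproc_per_node=8 train.py
# ("train.py" is a stand-in name, not nanochat's actual entry point.)
import os

import torch
import torch.distributed as dist

def setup_distributed() -> int:
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each of the 8 processes
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)  # pin this process to its own GPU
    return local_rank

if __name__ == "__main__":
    local_rank = setup_distributed()
    print(f"process {dist.get_rank()} ready on GPU {local_rank} "
          f"of {dist.get_world_size()}")
    dist.destroy_process_group()
```

Each process trains on its own shard of every batch and gradients are averaged across the GPUs, which is what compresses the whole run into four hours of wall-clock time.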
Stage 1: Pretraining - Teaching it language
The first stage is pretraining. This is where the model learns what language is - the patterns, the structure, the rhythm of how words fit together.
I fed it FineWeb - a massive, carefully filtered dataset of web text compiled by HuggingFace. The model processes billions of tokens, essentially reading a significant chunk of the internet, learning to predict what comes next.
At the start, the model is completely random. It outputs pure gibberish. But then something remarkable happens: you watch the loss curve drop. The model starts finding patterns. First it learns that certain letters follow others. Then it learns words. Then phrases. Then meaning.
You can literally see it in the training metrics - the loss ticking down step by step as the model gets a little less confused about what language is.
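Under the hood, "predicting what comes next" is one simple objective: cross-entropy loss on the next token. Here's a minimal sketch of a single pretraining step in PyTorch - the data and model are toy stand-ins, not nanochat's actual transformer:

```python
import torch
import torch.nn.functional as F

# Toy stand-ins: random token ids and a deliberately tiny "model".
vocab_size, seq_len, batch_size = 50_000, 128, 4
tokens = torch.randint(0, vocab_size, (batch_size, seq_len + 1))
model = torch.nn.Sequential(             # placeholder for a real transformer
    torch.nn.Embedding(vocab_size, 64),
    torch.nn.Linear(64, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# One pretraining step: predict token t+1 from the tokens up to t.
inputs, targets = tokens[:, :-1], tokens[:, 1:]
logits = model(inputs)                   # (batch, seq, vocab)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.3f}")        # the number you watch fall
```

On a freshly initialised model this loss sits near ln(50,000) ≈ 10.8 - pure guessing across the whole vocabulary - and every drop from there means the model is a little less confused.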
Stage 2: Finetuning - Teaching it conversation
Once the model understands language, you need to teach it how to use it. This is finetuning.
For this, I used SmolTalk - high-quality conversational data showing the model how to engage in dialogue. The format matters here: the model learns the structure of turn-taking, how to respond appropriately, how to be helpful.
This stage is faster but fascinating to watch. The model already knows language; now it’s learning its role. You’re essentially teaching it personality and purpose.
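Concretely, each conversation gets flattened into a single token stream with special markers around every turn. The marker names below are hypothetical (every chat schema picks its own), but the shape of the idea is standard:

```python
# Hypothetical turn markers - every chat schema picks its own names.
USER, ASSISTANT, END = "<|user|>", "<|assistant|>", "<|end|>"

def render_conversation(turns: list[tuple[str, str]]) -> str:
    """Flatten a list of (role, text) turns into one training string."""
    parts = []
    for role, text in turns:
        marker = USER if role == "user" else ASSISTANT
        parts.append(f"{marker}{text}{END}")
    return "".join(parts)

conversation = [
    ("user", "What's pretraining?"),
    ("assistant", "It teaches the model to predict the next token."),
]
print(render_conversation(conversation))
# In the finetuning loss, labels for the user's tokens are typically set to
# an ignore index (-100 in PyTorch) so only the assistant's replies are learned.
```

The loss masking is the key trick: the model reads the whole conversation but is only graded on the assistant's turns, which is how it learns its role rather than learning to imitate users.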
What you end up with
After those 4 hours, I had a working conversational AI. The total cost? £75 ($100).
Is it GPT-5? Absolutely not. It’s more like talking to a bright kindergartener - sometimes insightful, often confused, occasionally making surprising connections. But the fact that it works at all is remarkable.
The benchmarks tell the story:
| Benchmark | Score | What This Measures |
|---|---|---|
| MMLU | 31.51% | Multitask language understanding |
| GSM8K | 4.55% | Grade-school math problems |
| HumanEval | 8.54% | Python code generation |
| ARC-Easy | 38.76% | Science questions (easy) |
| ARC-Challenge | 28.07% | Science questions (hard) |
| ChatCORE | 8.84% | Conversational reasoning |
These aren’t impressive scores by frontier model standards. But they show something important: the model has learned. It didn’t memorise - it genuinely acquired the ability to reason (albeit at a basic level) about language, science, and code.
Scrolling back through the training metrics afterwards is fascinating - you can watch that capability take shape over just a few hours.
What makes this possible?
Three things have converged to make this achievable:
Open datasets: Projects like FineWeb and SmolTalk mean you don't need to spend months scraping and cleaning data - the hard work of curation is already done, and both are a few lines of code away (see the sketch after this list).
Commodity GPU access: Cloud providers are competing fiercely. The same GPUs that would cost hundreds of thousands to buy can be rented for $24/hour.
Educational transparency: People like Karpathy are demystifying the process, showing that training LLMs is not magic - it’s just code and compute.
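To make the first point concrete, here's how you might pull both datasets from the HuggingFace Hub with the datasets library. I'm assuming the standard repo IDs (HuggingFaceFW/fineweb and HuggingFaceTB/smoltalk); check the Hub pages for the exact configs:

```python
from datasets import load_dataset

# FineWeb is terabytes of filtered web text, so stream it rather than download.
fineweb = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)
print(next(iter(fineweb))["text"][:200])

# SmolTalk: conversational finetuning data, small enough to pull directly.
smoltalk = load_dataset("HuggingFaceTB/smoltalk", "all", split="train")
print(smoltalk[0]["messages"][:2])
```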
The real lesson
The technical achievement is interesting, but what struck me most was how accessible this has become. A few years ago, this was the exclusive domain of well-funded research labs. Today, anyone with curiosity and £75 can train their own language model.
We’re at a point where you can experiment with the fundamental technology reshaping our world on a shoestring budget. That feels significant.
Want to chat with the model?
Head to https://huggingface.co/spaces/sdobson/nanochat to give it a try. Or run it on your own machine:

docker run -it -p 7860:7860 --platform=linux/amd64 registry.hf.space/sdobson-nanochat:latest python app.py