From Prefill to Decode

How modern LLM inference actually works