Central takeaway of 'Next Token!': LLMs can't chat, think, or use tools — they are a magical next-token generator. Chat, reasoning, tool calling, agents and structured output are all implemented by code around the model (stop tokens, role conventions, JSON validation loops, tool-call runners). Corollary: there's a lot of software still to write around this new primitive, so developers worried about their jobs should relax.