Chapter 5: Context & Token Mastery
Model Selection Economics
Choosing the right model is not just about capability -- it is about matching the model to the task. Using Opus for everything is like driving a truck to the grocery store. It works, but it is not the most efficient choice.
The model lineup
| Model | Context Window | Speed | Best For | Relative Cost |
|---|---|---|---|---|
| Haiku 4.5 | 200K tokens | Fastest | Simple questions, subagent tasks | $ |
| Sonnet 4.6 | 1M tokens | Fast | Most coding tasks, refactoring, tests | $$ |
| Opus 4.6 | 1M tokens | Moderate | Complex architecture, multi-file reasoning, debugging | $$$ |
All three models are capable. They differ in depth of reasoning, speed, and cost per token.
The decision framework
When you are about to send a prompt, ask yourself these questions:
- "Can I describe this in one sentence?" -- Use Sonnet. Examples: "Add a loading spinner to the submit button," "Rename the items variable to todos," "Write tests for the SearchBar component."
- "Does this need deep reasoning about multiple files?" -- Use Opus. Examples: "Why does this race condition only happen in production?", "Design the data flow for a feature that touches 6 files," "Review the architecture and suggest improvements."
- "Is this a simple lookup or format change?" -- Use Haiku if available. Examples: "What type does this function return?", "Convert this object to an array," quick subagent tasks that Claude Code dispatches automatically.
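The three questions can be read as a small routing rule. The sketch below is illustrative only: `Task` and `pick_model` are hypothetical names, not part of Claude Code, and the order in which the questions are checked (deep reasoning first, then simple lookup, then Sonnet as the default) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a prompt you are about to send."""
    multi_file_reasoning: bool = False  # needs deep reasoning across files?
    simple_lookup: bool = False         # plain lookup or format change?
    haiku_available: bool = True        # is Haiku an option in your setup?


def pick_model(task: Task) -> str:
    """Return a model tier name based on the questions in the framework."""
    if task.multi_file_reasoning:
        return "opus"
    if task.simple_lookup and task.haiku_available:
        return "haiku"
    # One-sentence tasks and everything else default to Sonnet.
    return "sonnet"
```

For example, `pick_model(Task(simple_lookup=True, haiku_available=False))` falls through to Sonnet, matching the "use Haiku if available" wording.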
A good workflow: use Sonnet for building features, switch to Opus for debugging tricky issues or planning architecture. You can switch with /model anytime without losing your conversation.
Switching models mid-conversation
You can change models at any point in a conversation. Run /model and select a model: Haiku 4.5, Sonnet 4.6, or Opus 4.6.
When you switch, there is no context loss. The new model picks up where the previous one left off. The only thing that changes is the pricing for tokens going forward.
Token throughput: speed vs. depth
The models differ not just in cost but in how fast they produce output:
Sonnet outputs faster than Opus. For iterative workflows -- try something, review the result, fix an issue, repeat -- Sonnet's speed advantage compounds. Each cycle completes faster, which means more iterations per hour.
Opus reasons more deeply. It considers more possibilities and catches subtleties that Sonnet might miss. For complex debugging or architectural decisions, this deeper reasoning often means fewer total iterations. The "slower" model can actually be faster end-to-end because it gets closer to the right answer on the first try.
| Workflow | Better Model | Why |
|---|---|---|
| Build a component | Sonnet | Fast iteration, straightforward task |
| Debug a race condition | Opus | Needs deep multi-file reasoning |
| Write unit tests | Sonnet | Mechanical, well-defined task |
| Plan a refactoring | Opus | Needs to reason about tradeoffs |
| Rename variables | Sonnet (or Haiku) | Simple, repetitive change |
| Review architecture | Opus | Needs to hold many files in mind |
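The speed-versus-depth tradeoff comes down to simple arithmetic: total time is roughly the number of iterations times the time per iteration. The numbers in this sketch are made up for illustration and are not measurements of any model.

```python
def end_to_end_minutes(iterations: int, minutes_per_iteration: float) -> float:
    """Rough total time for a task: iterations times time per iteration."""
    return iterations * minutes_per_iteration


# Hypothetical numbers, chosen only to illustrate the tradeoff.
# A faster model that needs more attempts on a hard bug:
fast_model = end_to_end_minutes(iterations=6, minutes_per_iteration=2.0)
# A slower model that gets close to the answer in fewer attempts:
deep_model = end_to_end_minutes(iterations=2, minutes_per_iteration=4.0)

print(fast_model, deep_model)  # 12.0 8.0
```

With these assumed inputs the slower model finishes sooner. Reverse the iteration counts, as you would expect for a mechanical task, and the faster model wins.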
The Sonnet-first strategy
Here is the approach most experienced Claude Code users follow:
1. Start every task with Sonnet
2. Build the feature, write the tests, do the refactoring
3. If you hit a quality wall -- Claude keeps making the same mistake, or the logic is not quite right -- switch to Opus
4. Let Opus solve the hard part
5. Switch back to Sonnet for the next task
Most tasks never need step 3. Sonnet handles the vast majority of coding work. But when you do need Opus, you will know -- the task will feel like it requires more "thinking" than "doing."
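The strategy above amounts to an escalation loop. The following is a minimal sketch, assuming a hypothetical `attempt` callback that stands in for "run the task on this model and judge the result". The retry limit of three is also an assumption; in practice the "quality wall" is a judgment call, not a counter.

```python
from typing import Callable


def run_with_escalation(
    attempt: Callable[[str], bool],
    max_sonnet_attempts: int = 3,
) -> str:
    """Try Sonnet first and escalate to Opus after repeated failures.

    `attempt(model)` is a hypothetical callback that returns True when the
    result is acceptable. Returns the name of the model whose result was
    accepted, or "unresolved" if neither produced an acceptable result.
    """
    for _ in range(max_sonnet_attempts):
        if attempt("sonnet"):
            return "sonnet"
    # Quality wall reached: hand the hard part to Opus once.
    if attempt("opus"):
        return "opus"
    return "unresolved"
```

After the function returns, the next task starts again with Sonnet, which is the last step in the list.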
You do not need to optimize this from day one. Start with whatever model you are comfortable with. As you build intuition about what feels like a "Sonnet task" vs. an "Opus task," you will naturally start switching. The savings add up over time.
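To see how the savings accumulate, here is a rough sketch using placeholder cost weights. The weights 1, 2, and 3 only mirror the $ / $$ / $$$ column in the lineup table and are not real prices. The sketch also assumes every task uses about the same number of tokens.

```python
# Placeholder weights standing in for $ / $$ / $$$; NOT real prices.
RELATIVE_COST = {"haiku": 1, "sonnet": 2, "opus": 3}


def blended_cost(task_counts: dict[str, int]) -> int:
    """Total relative cost, assuming equal token usage per task."""
    return sum(RELATIVE_COST[model] * count for model, count in task_counts.items())


# Hypothetical workload of 100 tasks.
all_opus = blended_cost({"opus": 100})
sonnet_first = blended_cost({"sonnet": 90, "opus": 10})

print(all_opus, sonnet_first)  # 300 210
```

Under these assumptions, sending only the hardest tenth of tasks to Opus cuts the relative cost by 30 percent. The real figure depends on actual pricing and on how many tokens each task consumes.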