Token Management Strategies

You understand tokens and context windows. Now let's put that knowledge to work. Here are the most effective strategies for managing token usage, ranked by impact.

1. Start fresh between tasks

This is the single most impactful habit you can build. When you finish a task -- feature complete, bug fixed, refactoring done -- run /clear before starting the next one.

Terminal

$ /clear

Conversation history cleared. Context reset.

Do not carry a conversation about a sorting feature into work on a search feature. The leftover context from the previous task adds noise, costs tokens, and can lead Claude to draw connections that are not relevant.

Your CLAUDE.md provides the continuity you need. Every fresh session starts with the same project context, conventions, and patterns.
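To make the token cost of leftover context concrete, here is a rough sketch. It assumes a simplified model in which every turn resends the full conversation history as input and each turn adds a fixed number of tokens. `TOKENS_PER_TURN` and the turn counts are illustrative values, not measurements, and the model ignores prompt caching and compaction.

```python
# Illustrative model only: every turn resends the whole history as input.
TOKENS_PER_TURN = 2_000  # hypothetical average tokens added per turn


def cumulative_input_tokens(turns: int, tokens_per_turn: int = TOKENS_PER_TURN) -> int:
    """Total input tokens processed over a conversation of `turns` turns."""
    # Turn k carries k turns' worth of history: 1 + 2 + ... + turns.
    return tokens_per_turn * turns * (turns + 1) // 2


# One 40-turn conversation covering two tasks back to back.
one_long = cumulative_input_tokens(40)

# The same work as two 20-turn conversations with /clear in between.
two_short = 2 * cumulative_input_tokens(20)

print(one_long, two_short)
```

Under these assumptions the single long conversation processes about twice as many input tokens as the two short ones, because the second task keeps paying for the first task's history.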

2. Use /compact proactively

Do not wait for context to degrade. Compact after each major milestone:

Terminal

$ /compact focus on the search feature changes

Conversation compacted. Summary focused on search feature changes retained.

The optional focus phrase tells Claude what to prioritize in the summary. This is useful when a conversation has covered multiple topics but you only need to continue one thread.

Good times to compact:

Feature is built but you want to iterate on it
Bug is fixed and you are cleaning up
Design iteration is done, moving to implementation
You notice Claude referencing things from 20 messages ago

3. Choose the right model

Sonnet costs noticeably less per token than Opus; the exact ratio depends on the model versions and current pricing. If you use Sonnet by default and switch to Opus only for complex reasoning, you can save significantly over a work session.

Use /model to switch. No context loss, just different pricing going forward. See the previous section for a detailed decision framework.
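As a back-of-the-envelope illustration, the sketch below compares running a whole session on the more expensive model against using it for only part of the work. The prices are placeholder units with an assumed 3:1 ratio, not real rates, and the calculation ignores the difference between input and output pricing.

```python
# Placeholder prices in arbitrary units per million tokens (not real rates).
PRICE_DEFAULT = 1.0   # hypothetical price of the cheaper default model
PRICE_STRONG = 3.0    # hypothetical price of the stronger model


def cost(tokens: int, price_per_million: float) -> float:
    """Cost of processing `tokens` tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


SESSION_TOKENS = 2_000_000  # assumed total tokens in a work session

# Everything on the stronger model.
all_strong = cost(SESSION_TOKENS, PRICE_STRONG)

# 80% of the work on the default model, 20% on the stronger one.
mixed = cost(1_600_000, PRICE_DEFAULT) + cost(400_000, PRICE_STRONG)

print(all_strong, mixed)
```

With these assumed numbers, the mixed session costs less than half of the all-strong session. Substitute current prices and your own usage split to get a realistic figure.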

4. Delegate to subagents

When Claude Code launches an agent to research something -- reading multiple files, searching the codebase, investigating an error -- only the summary returns to your main context. The full research stays in the subagent's context and is discarded when the subagent finishes.

This can save 30-50% of context for investigation-heavy tasks. You get the answer without all the intermediate steps filling up your conversation.

💡Info

You do not need to do anything special to trigger subagents. Claude Code decides automatically when to delegate. But knowing this pattern helps you understand why "research this codebase" does not blow up your context the way you might expect.
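The effect on your main context can be sketched with simple accounting. The file names, token sizes, and `SUMMARY_TOKENS` below are hypothetical; the point is only that a delegated investigation adds the summary to the main conversation, while an inline one adds every file read.

```python
# Hypothetical token sizes for files read during an investigation.
FILE_READS = {"auth.py": 4_200, "session.py": 3_100, "middleware.py": 5_600}
SUMMARY_TOKENS = 400  # assumed size of the summary a subagent returns


def main_context_growth(reads: dict[str, int], delegated: bool) -> int:
    """Tokens added to the main conversation by an investigation."""
    if delegated:
        # Intermediate reads stay in the subagent's context and are discarded.
        return SUMMARY_TOKENS
    # Without delegation, every file read lands in the main context.
    return sum(reads.values())


inline = main_context_growth(FILE_READS, delegated=False)
delegated = main_context_growth(FILE_READS, delegated=True)
print(inline, delegated)
```

Note that the subagent still processes the files it reads, so this sketch describes savings in main-context space, not necessarily in total tokens consumed.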

5. Disable unused MCP servers

Each MCP server adds tool definitions to your context. If you have a database server, a Slack server, and a GitHub server connected but you are only working on code, the unused servers are consuming tokens for no benefit.

Terminal

$ /mcp

Connected MCP servers: github, slack, postgres

Run /mcp to see what is connected. Disconnect servers you are not using in your current workflow.
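For a rough sense of scale, the sketch below estimates how many tokens a set of tool definitions might occupy. The definitions are hypothetical examples shaped like the name, description, and input schema entries an MCP server advertises, and the 4-characters-per-token rule is a crude heuristic rather than a real tokenizer. Real servers typically expose many more tools than this.

```python
import json

# Hypothetical tool definitions; real servers expose different tools.
SERVERS = {
    "github": [{
        "name": "create_issue",
        "description": "Create an issue in a repository.",
        "inputSchema": {
            "type": "object",
            "properties": {"repo": {"type": "string"}, "title": {"type": "string"}},
            "required": ["repo", "title"],
        },
    }],
    "slack": [{
        "name": "post_message",
        "description": "Post a message to a channel.",
        "inputSchema": {
            "type": "object",
            "properties": {"channel": {"type": "string"}, "text": {"type": "string"}},
            "required": ["channel", "text"],
        },
    }],
    "postgres": [{
        "name": "run_query",
        "description": "Run a read-only SQL query.",
        "inputSchema": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    }],
}


def estimate_tokens(definitions: list[dict]) -> int:
    """Very rough size estimate: about 4 characters per token."""
    return len(json.dumps(definitions)) // 4


def overhead(active: list[str]) -> int:
    """Estimated tokens spent on tool definitions for the active servers."""
    return sum(estimate_tokens(SERVERS[name]) for name in active)


print(overhead(["github", "slack", "postgres"]), overhead(["github"]))
```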

6. Control extended thinking

Claude Code's thinking effort affects token usage. Lower effort means less internal reasoning and fewer tokens consumed.

/effort low -- For simple tasks like renaming, formatting, or lookups
Default effort -- Fine for most coding work
/effort high -- For complex reasoning, debugging, or architectural planning

Terminal

$ /effort low

Effort set to low. Claude will use less thinking for simpler responses.

Match the effort to the task. You do not need deep reasoning for "add a CSS class."

7. Filter verbose output with hooks

When a command produces 1,000 lines of output, Claude sees all of it. Test suites, build logs, and lint reports can dump thousands of tokens into your context.

Hooks can filter command output before it reaches context -- trimming test output to just failures, or build logs to just errors. This is an advanced technique covered in the Claude Code documentation.
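The filtering step itself can be quite small. The sketch below is a standalone function, not the Claude Code hooks API; how to wire a script like this into a hook is covered in the Claude Code documentation. The failure markers in `FAILURE_PATTERN` and the `max_lines` cap are assumptions to adjust for your own test runner.

```python
import re

# Assumed markers for lines worth keeping; adjust for your test runner.
FAILURE_PATTERN = re.compile(r"\b(FAIL|FAILED|ERROR|Error)\b")


def keep_failures(output: str, max_lines: int = 50) -> str:
    """Reduce verbose command output to the lines that mention failures."""
    lines = output.splitlines()
    kept = [line for line in lines if FAILURE_PATTERN.search(line)]
    if not kept:
        # Nothing failed: a one-line summary is enough.
        return f"no failures in {len(lines)} lines of output"
    dropped = len(kept) - max_lines
    if dropped > 0:
        # Cap the result so a badly broken run cannot flood the context.
        kept = kept[:max_lines] + [f"... {dropped} more failure lines omitted"]
    return "\n".join(kept)


sample = "PASS test_add\nFAIL test_search\nPASS test_sort\nERROR test_login"
print(keep_failures(sample))
```

A simple line filter like this drops surrounding stack traces, so in practice you may want to keep a few lines of context after each match.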

⚠️Warning

The #1 mistake intermediate users make is running one long conversation for an entire work session. Short, focused conversations with /clear between tasks produce better results AND cost less.

Try it yourself

Put these strategies into practice right now:

Open Claude Code in your todo-app
Run /cost -- note the starting point
Ask Claude to explain 3 different files
Run /cost again -- notice the increase
Run /compact
Run /cost one more time -- the session total is cumulative, so it does not go down
Run /context -- this is where you see the smaller active context after compaction

This exercise gives you a concrete feel for how quickly tokens accumulate with file reads, and how effectively /compact reclaims context space. Once you have seen the numbers, you will naturally start managing your sessions more intentionally.