Chapter 5: Context & Token Mastery
What's In Your Context Window?
Every time you start a Claude Code session, a bunch of things are loaded into your context window before you even type your first prompt. Understanding this "baseline overhead" is key to managing tokens effectively.
The full picture
Here's everything that can live in your context window, and roughly how many tokens each piece costs:
| What | Token cost | When it loads |
|---|---|---|
| System prompt | ~1-2K tokens | Always (Claude Code's base instructions, cached) |
| CLAUDE.md files | 200-2K+ tokens | Every session start |
| Auto memory (MEMORY.md) | 0-2K tokens | Session start (first 200 lines / 25KB) |
| Skill/command descriptions | 50-200 tokens each | Session start (full content loads on invocation) |
| MCP tool names | 100-500 tokens | Session start (full schemas deferred until use) |
| Git status | 100-500 tokens | Session start (current branch, uncommitted changes) |
| Your prompts | Varies | As you type them |
| Claude's responses | Varies | As Claude responds |
| File reads | Depends on file size | When Claude reads a file |
| Command output | Depends on verbosity | When Claude runs a command |
Total baseline overhead: roughly 1.5-5.5K tokens in a typical setup, before you type anything.
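The table above can be turned into a back-of-envelope estimator. This is a sketch using the table's rough ranges, not measured values; the function name and structure are invented for illustration:

```python
# Rough estimator for session baseline overhead, using the token
# ranges from the table above. All numbers are approximations.

BASELINE_TOKENS = {
    "system_prompt": (1_000, 2_000),
    "claude_md": (200, 2_000),
    "auto_memory": (0, 2_000),
    "skill_description_each": (50, 200),  # per skill/command description
    "mcp_tool_names": (100, 500),
    "git_status": (100, 500),
}

def baseline_range(num_skills: int = 0) -> tuple[int, int]:
    """Return a (min, max) baseline token estimate before the first prompt."""
    lo = hi = 0
    for key, (a, b) in BASELINE_TOKENS.items():
        if key == "skill_description_each":
            lo += a * num_skills   # scales with how many skills you have
            hi += b * num_skills
        else:
            lo += a
            hi += b
    return lo, hi

print(baseline_range())  # with no skills: (1400, 7000)
```

Note that the worst case (everything maxed out) lands above the typical total quoted above; in practice most sessions don't hit every maximum at once.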
That might sound like a lot, but in a 1M-token context window, it's less than 1%. The real context consumers are file reads, long conversations, and verbose command outputs.
This is why starting a fresh session with /clear is so powerful -- it resets everything except the baseline overhead. Your CLAUDE.md gives Claude Code all the project context it needs to get right back to work.
Where the tokens actually go
In a typical session, here's how context usage breaks down:
- Baseline (system prompt, CLAUDE.md, git status): ~2-4K tokens
- Your first prompt: ~50-200 tokens
- Claude reads 2-3 files to understand the task: ~1-3K tokens
- Claude's response with a plan and code: ~500-2K tokens
- Follow-up prompts and responses: accumulates over the session
- Command outputs (build errors, test results): can spike to 1-5K tokens each
By message 10-15 in a conversation, you might be at 15-30K tokens. By message 30+, you could be at 50-100K. This is why focused conversations matter.
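That accumulation can be sketched with a toy model. The per-message constants below are illustrative assumptions picked to land in the same ballpark as the ranges above, not measurements:

```python
# Toy model of context growth over a session.
# All per-message costs are illustrative assumptions.

BASELINE = 3_000          # system prompt + CLAUDE.md + git status
FILE_READS_EARLY = 2_000  # 2-3 file reads near the start of the task
PROMPT = 100              # a typical user message
RESPONSE = 1_000          # a typical reply with a plan and code
COMMAND_SPIKE = 3_000     # occasional build/test output

def context_after(messages: int) -> int:
    """Estimate total context tokens after a given number of exchanges."""
    total = BASELINE + FILE_READS_EARLY
    for i in range(1, messages + 1):
        total += PROMPT + RESPONSE
        if i % 5 == 0:    # assume a verbose command every ~5 messages
            total += COMMAND_SPIKE
    return total

for n in (10, 15, 30):
    print(n, context_after(n))  # 10 → 22000, 15 → 30500, 30 → 56000
```

The exact numbers don't matter; the shape does. Baseline is a one-time cost, while prompts, responses, and command spikes compound linearly, which is why long conversations dominate usage.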
The smart parts: caching and deferral
Claude Code doesn't load everything at full cost. Two mechanisms keep the baseline efficient:
Prompt caching -- The system prompt and CLAUDE.md are cached across requests. Cached tokens cost roughly 10% of standard pricing, so that 2K-token CLAUDE.md isn't costing you full price on every turn.
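The savings are easy to quantify with a quick back-of-envelope calculation. The rates below are placeholders, not real API prices; only the ~10% cache-read ratio comes from the text above:

```python
# Back-of-envelope caching savings. STANDARD_RATE is a placeholder
# price per input token, not a real API rate; cached reads are
# assumed to cost ~10% of standard, per the text above.

STANDARD_RATE = 3.00 / 1_000_000       # placeholder: $ per input token
CACHE_READ_RATE = STANDARD_RATE * 0.10

def claude_md_cost(tokens: int, turns: int) -> tuple[float, float]:
    """Cost of resending CLAUDE.md every turn: (uncached, cached)."""
    uncached = tokens * turns * STANDARD_RATE
    # First turn pays full price to populate the cache; later turns read it.
    cached = tokens * STANDARD_RATE + tokens * (turns - 1) * CACHE_READ_RATE
    return uncached, cached

uncached, cached = claude_md_cost(tokens=2_000, turns=30)
print(f"uncached: ${uncached:.4f}, cached: ${cached:.4f}")
```

Over a 30-turn session, a 2K-token CLAUDE.md costs roughly an eighth of what it would without caching, under these assumptions.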
Deferred loading -- Skill descriptions and MCP tool schemas are loaded as short summaries at session start. The full content only loads when you actually invoke them. This means having 20 MCP tools configured doesn't blow up your context -- only the ones you use add significant tokens.
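The deferral pattern looks roughly like this. All class and field names here are hypothetical, invented to illustrate the idea; word counts stand in for real token counts:

```python
# Sketch of deferred loading: short summaries enter context at session
# start; a tool's full schema is only paid for on first invocation.
# Names are hypothetical; word count is a crude stand-in for tokens.

from dataclasses import dataclass, field

@dataclass
class Tool:
    name: str
    summary: str       # short description, loaded at session start
    full_schema: str   # loaded only when the tool is actually used

@dataclass
class Context:
    tokens: int = 0
    loaded: set = field(default_factory=set)

    def register(self, tool: Tool) -> None:
        """Session start: pay only for the summary."""
        self.tokens += len(tool.summary.split())

    def invoke(self, tool: Tool) -> None:
        """First use: pay for the full schema, exactly once."""
        if tool.name not in self.loaded:
            self.tokens += len(tool.full_schema.split())
            self.loaded.add(tool.name)
```

With 20 tools registered, you pay 20 small summary costs up front, but only the tools you invoke ever add their full schemas.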
What this means for you
The practical takeaway: your context window is mostly consumed by the conversation itself -- the back-and-forth of prompts, responses, file reads, and command outputs. The baseline overhead is small and well-optimized. Focus your token management on keeping conversations short and targeted, not on trimming your system prompt.