Chapter 5: Context & Token Mastery

Skills & Token Overhead

In Chapter 2 you created custom commands like /project:today and /project:status. Those commands are incredibly useful -- but they are not free. Each one has a token cost, and understanding that cost helps you make smart decisions about how many to create.

How skills load into context

When Claude Code starts a session, it does not load every custom command in full. Instead, it uses a two-phase approach:

  1. Skill descriptions load at session start -- Claude Code reads the name and brief description of each custom command. This costs roughly 50-200 tokens per skill, depending on the length of the description.

  2. Full skill content loads on demand -- The actual prompt content inside the command file only loads when you invoke the skill or when Claude determines it is relevant to your current request.

This means having a /project:today command does not dump its entire prompt into your context on every session. Claude Code only loads the full content when you type /project:today or ask something related to daily todos.
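The two-phase approach can be pictured with a small Python sketch. It illustrates lazy loading in general and is not Claude Code's actual implementation; the Skill and SkillRegistry names and the sample prompt text are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str  # phase 1: always visible at session start
    body: str         # phase 2: loaded only when the skill is needed


@dataclass
class SkillRegistry:
    skills: dict[str, Skill] = field(default_factory=dict)
    loaded: set[str] = field(default_factory=set)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def session_start_context(self) -> list[str]:
        """Phase 1: only names and descriptions enter the context."""
        return [f"{s.name}: {s.description}" for s in self.skills.values()]

    def invoke(self, name: str) -> str:
        """Phase 2: the full prompt body enters the context on demand."""
        self.loaded.add(name)
        return self.skills[name].body


registry = SkillRegistry()
registry.register(Skill("/project:today", "Show today's todos",
                        "Read the todo file, list open items, ..."))
registry.register(Skill("/project:status", "Summarize project status",
                        "Collect recent changes and summarize ..."))

startup = registry.session_start_context()  # two short lines, no bodies
body = registry.invoke("/project:today")    # full content loads here
```

Only the short description strings are paid for up front; the longer body is charged once, at the moment the skill is invoked.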

Controlling when skills load

For skills that have side effects -- commands that modify files, run deployments, or trigger external services -- you can prevent Claude from invoking them on its own. Add this to the command's frontmatter:

---
disable-model-invocation: true
---

With this flag, the skill is completely hidden from Claude until you manually invoke it with the slash command. This brings its cost to near-zero until you actually use it.
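As a rough illustration of the effect, the sketch below reads the frontmatter of a command file and decides whether the skill would be advertised to the model at session start. The parser is deliberately simplified (flat key: value pairs only), and the function names and sample command text are hypothetical; this is not how Claude Code itself is implemented.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' block of flat 'key: value' pairs (simplified)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter reached
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as having no frontmatter


def visible_to_model(text: str) -> bool:
    """A skill stays hidden when disable-model-invocation is true."""
    flag = parse_frontmatter(text).get("disable-model-invocation", "false")
    return flag.lower() != "true"


# Hypothetical command files for illustration.
deploy_command = """---
disable-model-invocation: true
---
Deploy the current branch to staging.
"""

today_command = "List today's open todos."

hidden = not visible_to_model(deploy_command)  # flagged: manual invocation only
shown = visible_to_model(today_command)        # no flag: Claude may pick it up
```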

Tip

Custom commands are much cheaper than putting the same instructions in CLAUDE.md. A command only loads its full content when you invoke it. The same text in CLAUDE.md loads every single session.

MCP server tools follow the same pattern

If you have MCP (Model Context Protocol) servers connected to Claude Code, their tools work the same way:

  • Tool names and brief descriptions load at session start
  • Full tool schemas are deferred until Claude determines a tool is needed

Each connected MCP server adds a small amount of baseline overhead. If you have servers you are not actively using, they are still consuming tokens just by being connected.

The tradeoff

More skills means more baseline overhead, but each individual skill is cheap. At roughly 100-200 tokens per skill, the math works out like this:

Number of skills       Approximate baseline cost
5 custom commands      ~500-1,000 tokens
10 custom commands     ~1,000-2,000 tokens
25 custom commands     ~2,500-5,000 tokens
50+ custom commands    5,000+ tokens

Against a 1M-token context window, even 50 skills is a tiny fraction (5,000 tokens is 0.5%). But that overhead counts against your token budget in every session, so unnecessary skills add up over dozens of sessions.
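The arithmetic is easy to reproduce. The sketch below assumes the 100-200 tokens per skill that the table's figures imply and the 1M-token window mentioned above; the function name and the session count of 40 are made up for illustration.

```python
def baseline_cost(num_skills: int, low: int = 100, high: int = 200) -> tuple[int, int]:
    """Return a (low, high) estimate of session-start tokens for skill descriptions."""
    return num_skills * low, num_skills * high


CONTEXT_WINDOW = 1_000_000  # tokens, as assumed in the text

for n in (5, 10, 25, 50):
    lo, hi = baseline_cost(n)
    share = hi / CONTEXT_WINDOW * 100
    print(f"{n:>2} skills: {lo:,}-{hi:,} tokens (at most {share:.2f}% of the window)")

# The same baseline is paid again in every session.
sessions = 40  # hypothetical number of sessions
total_low, total_high = (cost * sessions for cost in baseline_cost(50))
```

Under these assumptions, 50 skills across 40 sessions come to somewhere between 200,000 and 400,000 tokens spent on descriptions alone, which is why the per-session view matters more than the share of a single context window.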

Practical advice

  • 5-10 custom commands is the sweet spot. Enough to cover your key workflows without meaningful overhead.
  • 50+ commands is a signal to audit. Which ones do you actually use? Remove or archive the rest.
  • Run /context to see per-skill token consumption and identify which skills are costing you the most.
  • Use disable-model-invocation: true for any command that should only run when you explicitly ask for it.
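For a quick audit outside of Claude Code, a script along these lines can list command files by size. It assumes project commands live as Markdown files under .claude/commands/ and uses a rough four-characters-per-token heuristic; both are assumptions. It estimates the full, on-demand size of each file rather than the session-start description cost, so /context remains the place to look for real numbers.

```python
from pathlib import Path


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (heuristic, not a tokenizer)."""
    return max(1, len(text) // 4) if text else 0


def audit_commands(root: Path) -> list[tuple[str, int]]:
    """Return (file name, estimated tokens) pairs, largest first."""
    if not root.is_dir():
        return []
    rows = [
        (path.name, estimate_tokens(path.read_text(encoding="utf-8")))
        for path in root.rglob("*.md")
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    report = audit_commands(Path(".claude/commands"))  # assumed location
    print(f"{len(report)} command files found")
    for name, tokens in report:
        print(f"{tokens:>6}  {name}")
```

A long list here is the cue to ask which commands you still use and to remove or archive the rest.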

Info

Think of custom commands like browser tabs. A few open tabs are fine. Thirty open tabs start to slow things down. The overhead is not catastrophic, but it is worth being intentional about what you keep loaded.