March 8, 2025

9,600 tokens just to say hello

How OpenClaw context loading works, why it costs more than you think, and what to do about it.

Every time you send a message to your OpenClaw agent, something happens before it reads a single word you typed. The system loads context. A lot of it.

Here is what gets loaded on every single turn:

  • SOUL.md — persona and tone instructions
  • AGENTS.md — workspace behavior rules
  • MEMORY.md — long-term memory (curated over time)
  • USER.md — info about you
  • HEARTBEAT.md — periodic task checklist
  • TOOLS.md — local environment notes
  • Skill descriptions — all installed skills, injected as XML
  • Runtime context — session info, capabilities, time zone
  • System prompt — base instructions from OpenClaw

Count those up on a typical install. SOUL.md runs about 1,200 tokens. AGENTS.md is another 2,000. MEMORY.md, if you have been using the agent for a few weeks, is probably 1,500 tokens. USER.md is 300. HEARTBEAT.md is 200. TOOLS.md is 400. Skill descriptions for 8-10 installed skills add another 1,500. Runtime context and system prompt add roughly 2,500 more.

Total: approximately 9,600 tokens. Before you say anything.
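The tally above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the per-file estimates quoted in this post. The dictionary and function names are illustrative and not part of OpenClaw.

```python
# Per-turn token estimates, taken from the figures quoted in this post.
# These are typical-install estimates, not measurements of your setup.
CONTEXT_TOKENS = {
    "SOUL.md": 1_200,
    "AGENTS.md": 2_000,
    "MEMORY.md": 1_500,
    "USER.md": 300,
    "HEARTBEAT.md": 200,
    "TOOLS.md": 400,
    "skill descriptions": 1_500,
    "runtime context + system prompt": 2_500,
}


def total_context_tokens(budget: dict[str, int]) -> int:
    """Sum the tokens injected before the user's message is read."""
    return sum(budget.values())


if __name__ == "__main__":
    total = total_context_tokens(CONTEXT_TOKENS)
    # Largest contributors first, with each item's share of the total.
    for name, tokens in sorted(CONTEXT_TOKENS.items(), key=lambda kv: -kv[1]):
        print(f"{name:<32} {tokens:>6,}  {tokens / total:6.1%}")
    print(f"{'total':<32} {total:>6,}")
```

Swap in your own estimates to see which file dominates your context load.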

The math

Claude Opus 4 is priced at $15 per million input tokens.

9,600 tokens at $15/MTok = $0.144 per turn in context loading alone.

At 50 messages per day:

  • 50 turns * $0.144 = $7.20/day
  • $7.20 * 30 = $216/month

That is $216/month just to load the same files over and over. Not counting what the model actually does with your message. Not counting output tokens. Not counting heartbeats or sub-agents.
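The arithmetic above can be written as a small function, sketched here under the assumptions used in this post: 9,600 tokens per turn, $15 per million input tokens, 50 turns per day, and a 30-day month. The function name is hypothetical.

```python
def context_cost(tokens_per_turn: int, usd_per_mtok: float,
                 turns_per_day: int, days: int = 30) -> tuple[float, float, float]:
    """Return (cost per turn, cost per day, cost per month) in USD.

    Covers input tokens for context loading only; output tokens,
    heartbeats, and sub-agents are not included.
    """
    per_turn = tokens_per_turn * usd_per_mtok / 1_000_000
    per_day = per_turn * turns_per_day
    per_month = per_day * days
    return per_turn, per_day, per_month


if __name__ == "__main__":
    per_turn, per_day, per_month = context_cost(9_600, 15.00, 50)
    print(f"per turn:  ${per_turn:.3f}")
    print(f"per day:   ${per_day:.2f}")
    print(f"per month: ${per_month:.2f}")
```

With the figures from this post, this gives $0.144 per turn, $7.20 per day, and $216.00 per month.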

Context loading is a fixed cost per turn. It does not matter if you say "hello" or write a 500-word prompt. The overhead is the same.

Why it is so high

OpenClaw is stateless by design. Every API call is independent. The model has no memory between turns unless you inject that memory explicitly. So OpenClaw injects everything, every time.

This is the right design. It is also expensive.

The files were written for humans to read. AGENTS.md, for example, has extensive explanations, examples, and nuance. All of that ends up in every prompt. A file that takes 30 seconds to write can cost you $50/month if it is verbose enough.

What you can actually do about it

1. Keep workspace files lean.

Go look at your MEMORY.md right now. How long is it? On Opus, every 1,000 tokens in that file adds $0.015 to each message. Same for AGENTS.md. Same for SOUL.md. Trim the redundant parts. Use bullet points instead of paragraphs. Delete anything the agent does not actually need at runtime.

2. Use light-context for cron jobs.

Scheduled tasks and cron jobs do not need your full workspace context. They are usually one-shot tasks: "summarize the inbox," "check if the build passed," "send the daily report." Configure these to run with minimal or no workspace injection. OpenClaw supports this through the light-context flag on cron sessions.

3. Audit which files are actually needed.

HEARTBEAT.md is loaded on every message. But does it need to be? If your heartbeat checklist is empty or rarely changes, you are paying to inject a near-empty file thousands of times a month. Same question for USER.md. If your agent does not make decisions based on your time zone or email address, trim those details.

4. Consider which model you are using.

On Claude Sonnet 4 ($3/MTok input), that 9,600-token context load costs about $0.029 per turn instead of $0.144. That is a 5x reduction for the same context. For workloads where Sonnet is sufficient, switching the default model is the highest-leverage change you can make.
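To compare models side by side, a sketch like the following may help. It uses only the two input prices quoted in this post and the same 9,600-token, 50-messages-per-day assumptions. Check current pricing before relying on the numbers.

```python
# Input prices in USD per million tokens, as quoted in this post.
INPUT_PRICE_PER_MTOK = {
    "Claude Opus 4": 15.00,
    "Claude Sonnet 4": 3.00,
}

CONTEXT_TOKENS = 9_600      # per-turn context load estimated in this post
TURNS_PER_MONTH = 50 * 30   # 50 messages/day over a 30-day month


def monthly_context_cost(usd_per_mtok: float,
                         tokens: int = CONTEXT_TOKENS,
                         turns: int = TURNS_PER_MONTH) -> float:
    """Monthly USD cost of re-loading the same context on every turn."""
    return tokens * usd_per_mtok / 1_000_000 * turns


if __name__ == "__main__":
    costs = {model: monthly_context_cost(price)
             for model, price in INPUT_PRICE_PER_MTOK.items()}
    baseline = max(costs.values())
    for model, cost in costs.items():
        # Ratio shows how much cheaper each model is than the most expensive one.
        print(f"{model:<16} ${cost:>7.2f}/month  ({baseline / cost:.1f}x cheaper than max)")
```

The context itself does not change here. Only the price per token does, which is why the model switch stacks with every trimming step above.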

The compound effect

Context loading costs compound with every other cost. Heartbeats load the same context. Sub-agents spin up their own context windows. Multiple channels each run their own sessions.

If you have 2 channels, each running 48 heartbeats per day (one every 30 minutes), that is 96 heartbeat turns per day. At ~6,000 tokens of context per heartbeat, that is 576,000 extra input tokens per day just from heartbeat context. On Opus: $8.64/day, about $259/month, from heartbeats alone.
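The heartbeat figures can be checked the same way. This sketch assumes the numbers used in this post: 2 channels, 48 heartbeats per channel per day, about 6,000 context tokens per heartbeat, and Opus input pricing. The function name is hypothetical.

```python
def heartbeat_context_cost(channels: int, heartbeats_per_channel_per_day: int,
                           tokens_per_heartbeat: int, usd_per_mtok: float,
                           days: int = 30) -> tuple[int, float, float]:
    """Return (input tokens per day, USD per day, USD per month) for heartbeats."""
    turns_per_day = channels * heartbeats_per_channel_per_day
    tokens_per_day = turns_per_day * tokens_per_heartbeat
    usd_per_day = tokens_per_day * usd_per_mtok / 1_000_000
    return tokens_per_day, usd_per_day, usd_per_day * days


if __name__ == "__main__":
    tokens, per_day, per_month = heartbeat_context_cost(
        channels=2,
        heartbeats_per_channel_per_day=48,
        tokens_per_heartbeat=6_000,
        usd_per_mtok=15.00,
    )
    print(f"{tokens:,} tokens/day  ${per_day:.2f}/day  ${per_month:.2f}/month")
```

Because the cost scales linearly with each argument, halving the heartbeat frequency or the context size halves the bill.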

The lever is the same in all cases: smaller context = lower cost everywhere.

A practical starting point

Run this audit:

  1. Check the size of each workspace file: wc -c ~/.openclaw/workspace/*.md
  2. For any file over 3KB, ask whether every sentence earns its place
  3. Move long-form memory into separate files and reference them only when relevant, not every turn
  4. Set heartbeat and cron tasks to use a lighter model (Haiku works fine for most checks)
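Steps 1 and 2 can also be scripted. The sketch below lists each workspace file by size and flags anything over 3KB. The token column assumes roughly 4 bytes per token, which is a common rule of thumb for English text and not a real tokenizer, so treat it as a ballpark. The function name is illustrative.

```python
from pathlib import Path

BYTES_PER_TOKEN = 4            # assumption: rough heuristic, not a tokenizer
SIZE_LIMIT_BYTES = 3 * 1024    # the 3KB review threshold from step 2


def audit_workspace(workspace: Path) -> list[tuple[str, int, int, bool]]:
    """Return (file name, bytes, estimated tokens, over limit) rows, largest first."""
    rows = []
    for path in workspace.glob("*.md"):
        size = path.stat().st_size
        rows.append((path.name, size, size // BYTES_PER_TOKEN, size > SIZE_LIMIT_BYTES))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Workspace path as used in step 1 of the audit.
    workspace = Path.home() / ".openclaw" / "workspace"
    for name, size, est_tokens, over_limit in audit_workspace(workspace):
        flag = "  <- review" if over_limit else ""
        print(f"{name:<16} {size:>7,} bytes  ~{est_tokens:>6,} tokens{flag}")
```

Files flagged for review are the first candidates for trimming or for moving into separate, on-demand memory files.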

Most users can reduce their context load by 40-50% in 20 minutes of cleanup. On Opus at 50 messages/day, that is roughly $86-$108/month you stop burning.

Use the Clawback calculator to see what your specific setup is costing you.

See your actual numbers

The calculator runs in your browser. No account needed.