June 5, 2026

The Agent Spend Review Template We Use

A lightweight weekly review template for finding the few agent behaviors that explain most of your LLM spend.

Review behavior, not just invoices

An LLM invoice tells you the total. It does not tell you which agent behavior created the total. A spend review should start from calls, tasks, models, and context size.

This is the weekly template we use.

1. Top tasks by spend

List the five most expensive task types for the week. For each one, note total calls, total cost, average input tokens, average output tokens, and model mix.

The goal is not to shame expensive tasks. Some tasks should be expensive. The goal is to spot tasks whose cost does not match their value.
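
As an illustration, the per-task summary can be built from raw call logs with a short aggregation. This is a minimal sketch, not a fixed schema: the record fields (`task`, `model`, `input_tokens`, `output_tokens`, `cost_usd`) and the sample values are assumptions, so map them to whatever your logging actually records.

```python
from collections import Counter, defaultdict

# Hypothetical call-log records; field names and values are illustrative only.
calls = [
    {"task": "code_review", "model": "large", "input_tokens": 12000, "output_tokens": 800, "cost_usd": 0.50},
    {"task": "code_review", "model": "small", "input_tokens": 8000, "output_tokens": 400, "cost_usd": 0.25},
    {"task": "heartbeat", "model": "small", "input_tokens": 2000, "output_tokens": 50, "cost_usd": 0.125},
    {"task": "heartbeat", "model": "small", "input_tokens": 2000, "output_tokens": 50, "cost_usd": 0.125},
    {"task": "triage", "model": "small", "input_tokens": 1000, "output_tokens": 100, "cost_usd": 0.0625},
]


def top_tasks_by_spend(calls, limit=5):
    """Group calls by task type and return the most expensive task types."""
    groups = defaultdict(list)
    for call in calls:
        groups[call["task"]].append(call)

    rows = []
    for task, items in groups.items():
        count = len(items)
        rows.append({
            "task": task,
            "calls": count,
            "total_cost_usd": round(sum(c["cost_usd"] for c in items), 4),
            "avg_input_tokens": sum(c["input_tokens"] for c in items) / count,
            "avg_output_tokens": sum(c["output_tokens"] for c in items) / count,
            # Model mix as call counts per model for this task type.
            "model_mix": dict(Counter(c["model"] for c in items)),
        })

    # Most expensive task types first.
    rows.sort(key=lambda row: row["total_cost_usd"], reverse=True)
    return rows[:limit]


report = top_tasks_by_spend(calls)
for row in report:
    print(row)
```

The output is one row per task type, which is enough to put cost next to a judgment about value during the review.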

2. Biggest context loads

Find the calls with the largest input token counts. Ask what was loaded and whether the model needed all of it.

Common fixes: summarize older history, fetch narrower file ranges, prune tool definitions, and replace full records with compact metadata.
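
One way to answer "what was loaded" is to log input tokens by source and look at the dominant source for the largest calls. The sketch below assumes each call record carries a hypothetical `context_parts` breakdown (history, files, tool definitions); most logs will not have this out of the box, so treat the field as something you would need to add.

```python
import heapq

# Hypothetical records: input tokens per call, split by where they came from.
calls = [
    {"call_id": "a1", "task": "code_review",
     "context_parts": {"history": 9000, "files": 30000, "tool_definitions": 4000}},
    {"call_id": "b2", "task": "triage",
     "context_parts": {"history": 1500, "files": 0, "tool_definitions": 4000}},
    {"call_id": "c3", "task": "refactor",
     "context_parts": {"history": 42000, "files": 6000, "tool_definitions": 4000}},
]


def input_tokens(call):
    """Total input tokens for a call, summed over its context parts."""
    return sum(call["context_parts"].values())


def largest_context_loads(calls, limit=10):
    """Return the calls with the largest inputs and what dominated each one."""
    report = []
    for call in heapq.nlargest(limit, calls, key=input_tokens):
        parts = call["context_parts"]
        size = input_tokens(call)
        if not parts or size == 0:
            continue  # Nothing was loaded, so there is nothing to explain.
        dominant = max(parts, key=parts.get)
        report.append({
            "call_id": call["call_id"],
            "task": call["task"],
            "input_tokens": size,
            "dominant_part": dominant,
            "dominant_share": parts[dominant] / size,
        })
    return report


loads = largest_context_loads(calls, limit=2)
for row in loads:
    print(row)
```

The dominant part points at the likely fix: history suggests summarizing, files suggests narrower ranges, and tool definitions suggests pruning.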

3. Retry loops

Sort by retry count. Any task with repeated identical failures deserves attention. The fix is usually not "use a smarter model." It is better tool error handling, clearer stop conditions, or a hard retry cap.
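
A hard retry cap combined with a stop condition for repeated identical failures might look like the sketch below. The function name, its parameters, and the idea of comparing errors by type and message are all assumptions made for illustration; a real agent loop would likely compare tool error codes instead.

```python
def run_with_retry_cap(action, max_attempts=3, max_identical_failures=2):
    """Call action() until it succeeds, the cap is hit, or the same error repeats."""
    last_signature = None
    identical = 0
    for attempt in range(1, max_attempts + 1):
        try:
            return {"ok": True, "result": action(), "attempts": attempt}
        except Exception as exc:
            # Treat two failures as identical when type and message match.
            signature = (type(exc).__name__, str(exc))
            identical = identical + 1 if signature == last_signature else 1
            last_signature = signature
            if identical >= max_identical_failures:
                # Retrying the same call again is unlikely to change the outcome.
                return {"ok": False, "reason": "identical_failure",
                        "attempts": attempt, "error": str(exc)}
    # Hard cap reached without success.
    return {"ok": False, "reason": "retry_cap", "attempts": max_attempts,
            "error": last_signature[1] if last_signature else None}


def failing_tool():
    """Stand-in for a tool call that always fails the same way."""
    raise ValueError("file not found: config.yaml")


result = run_with_retry_cap(failing_tool, max_attempts=5)
print(result)
```

Here the loop stops after the second identical failure instead of spending all five attempts, which is the behavior the review is trying to enforce.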

4. Model routing drift

Compare this week's model mix with last week's. If expensive models took a larger share, find out why. Was it intentional? Did a fallback trigger more often? Did a prompt change make a smaller model fail?
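
The week-over-week comparison can be reduced to a share per model and a threshold. This sketch measures share by cost rather than by call count, which is an assumption; the `model` and `cost_usd` fields, the sample data, and the 5-point threshold are also illustrative.

```python
from collections import defaultdict


def cost_share_by_model(calls):
    """Return each model's fraction of total cost for a set of calls."""
    totals = defaultdict(float)
    for call in calls:
        totals[call["model"]] += call["cost_usd"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {model: cost / grand_total for model, cost in totals.items()}


def routing_drift(last_week, this_week, threshold=0.05):
    """Return models whose cost share moved by at least `threshold`."""
    before = cost_share_by_model(last_week)
    after = cost_share_by_model(this_week)
    drift = {}
    for model in set(before) | set(after):
        change = after.get(model, 0.0) - before.get(model, 0.0)
        if abs(change) >= threshold:
            drift[model] = change
    return drift


# Hypothetical weekly logs.
last_week = [
    {"model": "large", "cost_usd": 2.0},
    {"model": "small", "cost_usd": 3.0},
    {"model": "small", "cost_usd": 3.0},
]
this_week = [
    {"model": "large", "cost_usd": 2.0},
    {"model": "large", "cost_usd": 2.0},
    {"model": "small", "cost_usd": 4.0},
]

drift = routing_drift(last_week, this_week)
print(drift)
```

A positive change for an expensive model is the prompt to ask the questions above: intentional change, more frequent fallback, or a prompt edit that made the smaller model fail.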

5. Idle automation

Separate background checks from interactive sessions. Heartbeats and watchers should show a steady call count and cost from week to week. Sudden growth means something is running too often, loading too much, or failing repeatedly.
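
A simple growth check on background spend is sketched below. It assumes each call is tagged with a hypothetical `source` field (`"background"` or `"interactive"`), and the 25% growth threshold is a placeholder to tune, not a recommendation.

```python
def background_cost(calls):
    """Total cost of calls tagged as background automation."""
    return sum(c["cost_usd"] for c in calls if c["source"] == "background")


def background_growth(last_week, this_week, max_growth=0.25):
    """Compare background spend week over week and flag sudden growth."""
    before = background_cost(last_week)
    after = background_cost(this_week)
    if before == 0:
        # No baseline: any background spend at all counts as unbounded growth.
        growth = float("inf") if after > 0 else 0.0
    else:
        growth = (after - before) / before
    return {
        "before_usd": before,
        "after_usd": after,
        "growth": growth,
        "alert": growth > max_growth,
    }


# Hypothetical weekly logs: interactive spend is flat, background spend is not.
last_week = [
    {"source": "background", "cost_usd": 1.0},
    {"source": "background", "cost_usd": 1.0},
    {"source": "interactive", "cost_usd": 5.0},
]
this_week = [
    {"source": "background", "cost_usd": 1.5},
    {"source": "background", "cost_usd": 1.5},
    {"source": "interactive", "cost_usd": 5.0},
]

check = background_growth(last_week, this_week)
print(check)
```

Keeping interactive sessions out of the comparison matters because their volume follows user activity, while background jobs should not change unless someone changed them.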

6. One change for next week

Do not end the review with a long list. Pick one change: move a heartbeat to a cheaper model, cap retries, summarize history, remove an unused tool, or add an alert.

Measure the impact the following week.

The bottom line

Most agent teams do not need a giant FinOps process. They need a 20-minute weekly review that names which behaviors drove spend and fixes one of them.

Clawback is built around that review loop: find the expensive behavior, change it, verify the savings.
