Estimate multi-step AI agent pipeline costs. Configure planner, worker, and critic LLMs — see exact cost per run and monthly totals. Free, no signup.
Define your pipeline steps · Enter run volume · Results update live
Typical 3-step agent: planner → worker → critic. Monthly cost for 1,000 runs.
| Pipeline | Step 1 (Planner) | Step 2 (Worker) | Step 3 (Critic) | Cost / 1K runs |
|---|---|---|---|---|
| Budget (cheapest) | Flash | GPT-4o mini | GPT-4o mini | ~$0.80 |
| Balanced | GPT-4o mini | GPT-4o | GPT-4o mini | ~$27 |
| High Quality | GPT-4o | Claude Sonnet | GPT-4o | ~$95 |
| Maximum | Claude Sonnet | Claude Sonnet | Claude Sonnet | ~$185 |
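Use cheap models (Flash, GPT-4o mini) for planning and verification, and reserve expensive models (GPT-4o, Claude Sonnet) for the main worker step. This can cut costs several-fold with minimal quality loss: in the table above, the Balanced pipeline, which follows this pattern, costs about 7× less than Maximum (~$27 vs. ~$185 per 1,000 runs).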
A 3-step agent (planner + worker + critic) with GPT-4o mini costs ~$0.0008 per run. Switching the worker step to GPT-4o raises that to ~$0.027 per run. At 1,000 runs/month, that works out to $0.80 to $27 depending on model choice.
Gemini 1.5 Flash ($0.075/$0.30 per 1M tokens) and GPT-4o mini ($0.15/$0.60) are cheapest. Use Flash for planning/verification steps and GPT-4o or Claude Sonnet only for the main reasoning step.
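To make the per-token prices concrete, here is a minimal sketch that converts them into a cost per call. The prices are the ones quoted above; the dictionary keys are just labels, and the step size of 1,000 input and 500 output tokens is an assumption chosen for illustration.

```python
# Prices in USD per 1M tokens as (input, output), taken from the figures above.
# The keys are plain labels for this sketch, not guaranteed API model IDs.
PRICES = {
    "gemini-1.5-flash": (0.075, 0.30),
    "gpt-4o-mini": (0.15, 0.60),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single LLM call."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Assumed step size: 1,000 input tokens and 500 output tokens.
for model in PRICES:
    print(f"{model}: ${call_cost(model, 1_000, 500):.6f} per call")
```

With these assumed token counts, a Flash call comes to $0.000225 and a GPT-4o mini call to $0.000450, so output tokens account for most of the cost in both cases.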
Use cheap models for routing, planning, and verification. Cache repeated tool calls. Limit max steps per agent run. Use streaming to detect early termination. Set hard token limits per step.
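Two of the tips above, caching repeated tool calls and capping steps per run, can be sketched as follows. The tool, the `MAX_STEPS` value, and the query list are all hypothetical; a hard token limit is normally set as a request parameter on the provider's API and is not shown here.

```python
from functools import lru_cache

MAX_STEPS = 5  # assumed cap on steps per agent run

calls_made = 0  # counts real (uncached) tool executions


@lru_cache(maxsize=256)
def cached_tool(query: str) -> str:
    """Hypothetical tool; identical queries are served from the cache."""
    global calls_made
    calls_made += 1
    return f"result for {query}"


def run_agent(queries: list[str]) -> list[str]:
    """Execute at most MAX_STEPS tool steps, then stop."""
    results = []
    for step, query in enumerate(queries):
        if step >= MAX_STEPS:
            break  # hard stop: never exceed the step budget
        results.append(cached_tool(query))
    return results


out = run_agent(["a", "b", "a", "a", "c", "d", "e"])
print(len(out), calls_made)  # prints "5 3"
```

Seven steps were requested, but only five ran, and only three of those reached the tool because the repeated query `"a"` was answered from the cache.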
An agent pipeline is a multi-step workflow in which each step makes an LLM call: (1) the Planner decides what to do, (2) the Worker executes tasks (often with tool use), and (3) the Critic checks the result. Total cost = (summed cost of all LLM calls in one run) × number of runs.
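The total-cost formula can be written out as a short sketch. The `Step` class and the per-call costs below are assumptions for illustration; the costs are chosen so that the three steps add up to the ~$0.0008 per run quoted earlier for a GPT-4o mini pipeline.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    cost_per_call: float    # USD per LLM call (assumed value)
    calls_per_run: int = 1  # a worker using tools may call the LLM several times


def cost_per_run(steps: list[Step]) -> float:
    """Sum the cost of every LLM call made in a single run."""
    return sum(s.cost_per_call * s.calls_per_run for s in steps)


def total_cost(steps: list[Step], runs: int) -> float:
    """Total cost = cost of one run multiplied by the number of runs."""
    return cost_per_run(steps) * runs


# Hypothetical split of a ~$0.0008 run across the three steps.
pipeline = [
    Step("planner", 0.0002),
    Step("worker", 0.0004),
    Step("critic", 0.0002),
]

print(f"per run: ${cost_per_run(pipeline):.4f}")
print(f"1,000 runs: ${total_cost(pipeline, 1_000):.2f}")
```

This prints $0.0008 per run and $0.80 for 1,000 runs. Raising `calls_per_run` on the worker shows how quickly tool-heavy steps come to dominate the total.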