Cookbook: Budget-Aware Workflows
Design pipelines that stop spending when a cost or token budget is reached. This recipe shows how to supply budgets, how Flujo enforces them with quotas, and how to keep loops and parallel branches within limits.
What you'll learn
- Set per-run budgets in code or flujo.toml.
- Add stricter budgets to individual steps.
- Understand how quotas protect loops and parallel branches before they execute.
Choose where the budget comes from
Programmatic budget (code)
import asyncio

from flujo import Flujo, Pipeline, Step
from flujo.agents import make_agent_async
from flujo.domain.dsl.loop import LoopStep
from flujo.domain.models import UsageLimits

# Two agents that return dicts: one decides whether to stop, one expands the plan.
summarize = make_agent_async(
    "openai:gpt-4o-mini",
    "Summarize the latest findings and say whether to stop or continue.",
    dict,
)
research = make_agent_async(
    "openai:gpt-4o-mini",
    "Expand on the summary with one concrete follow-up action.",
    dict,
)

# Step-level limits: each step gets a tighter budget than the run as a whole.
loop_body = Pipeline(
    steps=[
        Step(
            "plan",
            summarize,
            usage_limits=UsageLimits(total_cost_usd_limit=0.10, total_tokens_limit=1500),
        ),
        Step(
            "execute",
            research,
            usage_limits=UsageLimits(total_cost_usd_limit=0.20, total_tokens_limit=2500),
        ),
    ]
)

# The loop exits when the body returns {"action": "stop"} or after 4 iterations.
pipeline = Pipeline(
    steps=[
        LoopStep(
            name="research_loop",
            loop_body_pipeline=loop_body,
            exit_condition_callable=lambda output, ctx: (output or {}).get("action") == "stop",
            max_loops=4,
        )
    ]
)

# Run-level limits: the ceiling for the whole run, shared by every step.
runner = Flujo(
    pipeline,
    usage_limits=UsageLimits(total_cost_usd_limit=1.00, total_tokens_limit=6000),
)


async def main():
    # run_async is a coroutine, so it must be awaited inside an event loop.
    return await runner.run_async("Start with a two-sentence summary.")


result = asyncio.run(main())
Central budget (flujo.toml)
[budgets.default]
total_cost_usd_limit = 5.0
total_tokens_limit = 20000
[budgets.pipeline]
"research-*" = { total_cost_usd_limit = 2.5 }
- Pipelines whose names match no pattern under [budgets.pipeline] fall back to budgets.default.
- Code-provided limits and flujo.toml limits are combined using the most restrictive values.
- See: guides/cost_and_budget_governance.md for the complete policy rules.
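The "most restrictive values" rule can be pictured with a small, self-contained sketch. It does not use Flujo's own classes: Limits, most_restrictive, and budget_for are hypothetical names invented for illustration, and the lookup order (first matching glob, otherwise the default) is an assumption rather than Flujo's documented behavior. The numbers mirror the examples above.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional


@dataclass(frozen=True)
class Limits:
    """Hypothetical stand-in for UsageLimits; None means 'no limit set'."""
    total_cost_usd_limit: Optional[float] = None
    total_tokens_limit: Optional[int] = None


def _tighter(a, b):
    """Return the smaller of two optional limits, treating None as unlimited."""
    if a is None:
        return b
    if b is None:
        return a
    return min(a, b)


def most_restrictive(code: Limits, config: Limits) -> Limits:
    """Combine two sets of limits field by field, keeping the tighter value."""
    return Limits(
        total_cost_usd_limit=_tighter(code.total_cost_usd_limit, config.total_cost_usd_limit),
        total_tokens_limit=_tighter(code.total_tokens_limit, config.total_tokens_limit),
    )


def budget_for(name: str, patterns: dict, default: Limits) -> Limits:
    """Assumed lookup: the first matching glob wins, otherwise the default."""
    for pattern, limits in patterns.items():
        if fnmatch(name, pattern):
            return limits
    return default


# Values taken from the flujo.toml and code examples above.
default = Limits(total_cost_usd_limit=5.0, total_tokens_limit=20000)
patterns = {"research-*": Limits(total_cost_usd_limit=2.5)}
from_code = Limits(total_cost_usd_limit=1.00, total_tokens_limit=6000)

config = budget_for("research-weekly", patterns, default)
effective = most_restrictive(from_code, config)
print(effective)
```

Under these assumptions the run is capped by the code-provided values, because both are tighter than what the matching pattern allows. The authoritative rules are in guides/cost_and_budget_governance.md.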
Runtime enforcement: Reserve → Execute → Reconcile
- The runner builds a root Quota from the resolved UsageLimits and shares it with every step.
- Policies estimate step usage, reserve quota before an API call, and raise UsageLimitExceededError if the reservation fails.
- After execution, actual usage is reconciled so unused budget is refunded.
- Loops reuse the same quota across iterations; parallel branches receive split quotas so a single branch cannot overspend the run-level budget.
- Pricing must be configured; otherwise PricingNotConfiguredError will surface before spending.
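The Reserve → Execute → Reconcile cycle described in the list above can be sketched with a toy quota that counts tokens. This is an illustration of the pattern only, not Flujo's Quota implementation: ToyQuota, BudgetExceeded, and run_step are hypothetical names, and the equal split between branches is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical stand-in for UsageLimitExceededError."""


@dataclass
class ToyQuota:
    remaining_tokens: int

    def reserve(self, estimate: int) -> None:
        # Fail before any spending happens if the estimate does not fit.
        if estimate > self.remaining_tokens:
            raise BudgetExceeded(f"need {estimate}, have {self.remaining_tokens}")
        self.remaining_tokens -= estimate

    def reconcile(self, estimate: int, actual: int) -> None:
        # Refund the unused part of the reservation (or charge any overage).
        self.remaining_tokens += estimate - actual

    def split(self, n: int) -> list[ToyQuota]:
        # Give each of n branches an equal share; any remainder stays here.
        share = self.remaining_tokens // n
        self.remaining_tokens -= share * n
        return [ToyQuota(share) for _ in range(n)]


def run_step(quota: ToyQuota, estimate: int, call):
    quota.reserve(estimate)                # 1. Reserve
    try:
        actual = call()                    # 2. Execute; returns tokens used
    except Exception:
        quota.reconcile(estimate, 0)       # release the reservation on failure
        raise
    quota.reconcile(estimate, actual)      # 3. Reconcile
    return actual


root = ToyQuota(6000)
run_step(root, 1500, lambda: 900)          # reserved 1500, used 900, 600 refunded
remaining_after_first = root.remaining_tokens

branches = root.split(2)                   # two parallel branches share what is left
blocked = False
try:
    run_step(branches[0], 3000, lambda: 100)
except BudgetExceeded:
    blocked = True                         # rejected before the call was made
```

The last call is rejected at the reservation stage, so the branch spends nothing even though the actual usage would have fit. That is the trade-off of reserving on estimates: enforcement happens before the API call, at the cost of depending on how good the estimate is.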
Checklist for budget-aware pipelines
- Provide usage_limits when creating Flujo and tighten step-level usage_limits on the most expensive agents.
- Keep loop exit conditions deterministic and review them with quota awareness (see cookbook/safe_loop_budgeting.md).
- For YAML pipelines, place budgets in flujo.toml so CLI runs inherit them automatically.
- Surface budget failures to operators; the raised UsageLimitExceededError includes the partial PipelineResult for debugging and auditing.
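As a rough sketch of the last checklist item, the snippet below turns a budget failure into a message an operator can act on. It is self-contained and uses stand-ins: BudgetError and PartialResult are hypothetical substitutes for UsageLimitExceededError and PipelineResult, and the result attribute name and the fields on the partial result are assumptions. Check the Flujo API for the real names before adapting it.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("budget")


@dataclass
class PartialResult:
    """Hypothetical stand-in for the partial PipelineResult."""
    completed_steps: list = field(default_factory=list)
    total_cost_usd: float = 0.0


class BudgetError(Exception):
    """Hypothetical stand-in for UsageLimitExceededError."""

    def __init__(self, message, result=None):
        super().__init__(message)
        self.result = result  # assumed attribute name


def describe_budget_failure(exc: Exception) -> str:
    """Build a one-line summary of what ran and what it cost before the stop."""
    partial = getattr(exc, "result", None)
    if partial is None:
        return f"Budget exceeded: {exc} (no partial result attached)"
    steps = ", ".join(partial.completed_steps) or "none"
    return (
        f"Budget exceeded: {exc}; completed steps: {steps}; "
        f"spent ${partial.total_cost_usd:.2f}"
    )


try:
    # Simulate a run that was stopped after two steps.
    raise BudgetError(
        "cost limit of $1.00 reached",
        PartialResult(completed_steps=["plan", "execute"], total_cost_usd=0.98),
    )
except BudgetError as exc:
    message = describe_budget_failure(exc)
    logger.error(message)
```

Logging the completed steps and the spend alongside the error gives operators enough context to decide whether to raise the limit or fix the pipeline.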
See also
- cookbook/safe_loop_budgeting.md — sharing quota across iterations without overruns.
- guides/cost_and_budget_governance.md — centralized budgeting rules and CLI helpers.
- advanced/usage_estimation.md — how estimates are produced for reservations.