Track 29·Reliability

Stopping a batch when the token budget runs out

BudgetedRunner wraps any agent and tracks cumulative token usage. Before each task it checks whether enough tokens remain. If not, it raises BudgetExceeded and the batch stops.

intermediate · 6 min
Browse this tutorial's folder in tutorials-py: github.com/OpenSymbolicAI/tutorials-py/tree/main/29-budget-guard

Track 28 showed how to read token counts from a single result. The one new thing here: a thin wrapper that accumulates those counts across a batch and stops before starting a task it cannot afford.

The wrapper

python
MIN_PLANNING_TOKENS = 200


class BudgetExceeded(Exception):
    pass


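class BudgetedRunner:
    """Wrap an agent and stop the batch once the token budget is nearly spent."""

    def __init__(self, agent, budget: int) -> None:
        self._agent = agent
        self._budget = budget
        self._used = 0  # cumulative tokens across all completed tasks

    @property
    def tokens_remaining(self) -> int:
        # Clamp at zero: the last task may push usage past the budget.
        return max(0, self._budget - self._used)

    def run(self, task: str):
        # Check before calling the agent, so a blocked task costs nothing.
        if self.tokens_remaining < MIN_PLANNING_TOKENS:
            raise BudgetExceeded(
                f"{self.tokens_remaining} tokens remaining, "
                f"need at least {MIN_PLANNING_TOKENS} to plan"
            )
        result = self._agent.run(task)
        # A task's usage is only known once it has finished.
        self._used += result.metrics.plan_tokens.total_tokens
        return result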

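The guard fires before the task runs, so a blocked task costs zero tokens. MIN_PLANNING_TOKENS is a floor, not a per-task estimate. The wrapper only learns what a task cost after it finishes, so a task that starts just above the floor can still overshoot the budget. Set the floor at or above your typical input token count so you do not start a task that is likely to fail mid-plan.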
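To see the guard without calling a model, the wrapper can be driven by a stub. The sketch below repeats the wrapper so it runs on its own; `FakeAgent` and its fixed token costs are hypothetical stand-ins, not part of the framework. It assumes only that a result exposes `metrics.plan_tokens.total_tokens`, as in the wrapper above.

```python
from types import SimpleNamespace

MIN_PLANNING_TOKENS = 200


class BudgetExceeded(Exception):
    pass


class BudgetedRunner:
    # Same wrapper as above, repeated so this sketch is self-contained.
    def __init__(self, agent, budget: int) -> None:
        self._agent = agent
        self._budget = budget
        self._used = 0

    @property
    def tokens_remaining(self) -> int:
        return max(0, self._budget - self._used)

    def run(self, task: str):
        if self.tokens_remaining < MIN_PLANNING_TOKENS:
            raise BudgetExceeded(
                f"{self.tokens_remaining} tokens remaining, "
                f"need at least {MIN_PLANNING_TOKENS} to plan"
            )
        result = self._agent.run(task)
        self._used += result.metrics.plan_tokens.total_tokens
        return result


class FakeAgent:
    """Hypothetical stand-in for a real agent: no model call, fixed token costs."""

    def __init__(self, costs):
        self._costs = iter(costs)
        self.calls = 0  # how many times run() was actually invoked

    def run(self, task: str):
        self.calls += 1
        tokens = SimpleNamespace(total_tokens=next(self._costs))
        return SimpleNamespace(
            result=None,
            metrics=SimpleNamespace(plan_tokens=tokens),
        )


agent = FakeAgent(costs=[442, 460, 450])  # made-up per-task costs
runner = BudgetedRunner(agent, budget=1000)

runner.run("task 1")  # 558 tokens remain afterwards
runner.run("task 2")  # 98 tokens remain afterwards

try:
    runner.run("task 3")
    blocked = False
except BudgetExceeded:
    blocked = True

print(blocked, agent.calls, runner.tokens_remaining)  # True 2 98
```

The stub is called twice, not three times: the third task is rejected before it reaches the agent, and the remaining count is unchanged by the exception.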

Run six tasks against a 1,000-token budget

python
# main.py
# BudgetedRunner and BudgetExceeded are the classes from "The wrapper" above:
# define them in this file or import them from wherever you saved them.
from calc import Calc
from opensymbolicai.llm import LLMConfig

TASKS = [
    "What is 7 + 3?",
    "What is 12 * 15 - 47?",
    "What is 8 factorial?",
    "What is the 10th Fibonacci number?",
    "What is 6 factorial plus the 8th Fibonacci number?",
    "What is (factorial of 5) divided by (fibonacci of 6), then add 12?",
]

BUDGET = 1000

llm = LLMConfig(provider="ollama", model="qwen2.5-coder:7b")
runner = BudgetedRunner(Calc(llm=llm), budget=BUDGET)

print(f"Budget: {BUDGET} tokens\n")

for i, task in enumerate(TASKS):
    try:
        result = runner.run(task)
        print(f"  ok  {task}")
        print(f"      result={result.result}  used={runner._used}  remaining={runner.tokens_remaining}\n")
    except BudgetExceeded as e:
        print(f"  --  {task}")
        print(f"      BudgetExceeded: {e}\n")
        # Tasks after the blocked one are never attempted.
        print(f"  Stopping -- {len(TASKS) - i - 1} task(s) skipped.")
        break

bash
uv run main.py

Output:

text
Budget: 1000 tokens

  ok  What is 7 + 3?
      result=10  used=442  remaining=558

  ok  What is 12 * 15 - 47?
      result=133  used=902  remaining=98

  --  What is 8 factorial?
      BudgetExceeded: 98 tokens remaining, need at least 200 to plan

  Stopping -- 3 task(s) skipped.

What to notice

  • The guard checks before, not after. Task 3 is blocked because only 98 tokens remain, under the 200-token floor. It would not have fit in any case: the typical input alone is ~420 tokens. The guard does not estimate what a task will cost; it just enforces the floor.
  • A blocked task costs nothing. The model is never called for task 3. Usage stays at 902 tokens after the exception.
  • The wrapper needs no framework changes. result.metrics.plan_tokens.total_tokens is the only hook into the framework. The rest is plain Python.
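Because the guard enforces a floor rather than a cap, the choice of floor decides whether the last task can overshoot the budget. The sketch below replays the guard's check over a list of per-task token costs. `simulate` is a hypothetical helper written for this illustration, and the 400-token costs are made up; only the 442 and 460 figures come from the run above.

```python
def simulate(costs, budget, floor):
    """Replay the guard's logic: return (tasks completed, tokens used)."""
    used = 0
    completed = 0
    for cost in costs:
        # Same check as BudgetedRunner.run: block if remaining < floor.
        if max(0, budget - used) < floor:
            break
        # The task runs to completion even if it exceeds what remains.
        used += cost
        completed += 1
    return completed, used


# The run above: tasks cost 442 and 460 tokens, then 98 remain and the guard blocks.
print(simulate([442, 460, 450, 450, 450, 450], budget=1000, floor=200))  # (2, 902)

# Floor below the per-task cost: exactly 200 tokens remain after two tasks,
# so the third task starts and pushes usage past the budget.
print(simulate([400, 400, 400], budget=1000, floor=200))  # (3, 1200)

# Floor equal to the per-task cost: the third task is blocked instead.
print(simulate([400, 400, 400], budget=1000, floor=400))  # (2, 800)
```

If the budget is a hard limit, a floor at or above the largest per-task cost you expect keeps usage within it, at the price of leaving some tokens unspent.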