Your first @evaluator and seek()

run() generates a plan and executes it once. GoalSeeking adds a loop: plan, execute, evaluate. If the evaluator reports that the goal is not met, the loop runs again, and the result of the previous attempt is passed back to the model.
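The plan, execute, evaluate cycle can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `seek_sketch`, `plan_and_execute`, and `is_done` are hypothetical names, and the real seek() also builds prompts, records traces, and returns a richer result object.

```python
from typing import Callable


def seek_sketch(
    plan_and_execute: Callable[[dict], None],
    is_done: Callable[[dict], bool],
    max_iterations: int = 10,
) -> tuple[str, int]:
    """Hypothetical stand-in for a goal-seeking loop (not the library's code)."""
    context: dict = {}
    for iteration in range(1, max_iterations + 1):
        plan_and_execute(context)  # plan + execute: mutates the shared context
        if is_done(context):       # evaluate: stop as soon as the goal is met
            return "achieved", iteration
        # Otherwise loop again; the updated context is what the next plan sees.
    return "max_iterations", max_iterations


def count_up(context: dict) -> None:
    """Toy 'plan and execute' step: increment a counter in the context."""
    context["n"] = context.get("n", 0) + 1


status, iterations = seek_sketch(count_up, lambda ctx: ctx["n"] >= 3)
print(status, iterations)
```

Here the toy goal is met on the third pass, so the sketch reports `achieved` after 3 iterations. If `is_done` never returns True, it falls through and reports `max_iterations` instead.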

The task

The agent must find a secret number between 1 and 1,000. After each guess it receives a temperature hint and a direction, such as "cold — go higher" or "hot — go lower", or simply "correct" when the guess is right. In each iteration the model picks the midpoint of the remaining range and guesses it; the evaluator then checks whether the hint came back as "correct".
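How quickly midpoint guessing converges can be checked without a model. The sketch below is a plain-Python simulation of the strategy; `guesses_needed` is a hypothetical helper and not part of the library.

```python
def guesses_needed(secret: int, low: int = 1, high: int = 1000) -> list[int]:
    """Return the sequence of midpoint guesses made until the secret is hit."""
    guesses = []
    while low <= high:
        n = (low + high) // 2
        guesses.append(n)
        if n == secret:
            break
        if n < secret:
            low = n + 1   # equivalent to a "go higher" hint
        else:
            high = n - 1  # equivalent to a "go lower" hint
    return guesses


print(guesses_needed(742))

# Worst case over every possible secret in the range.
worst = max(len(guesses_needed(s)) for s in range(1, 1001))
print(worst)
```

For a secret of 742 the guesses are 500, 750, 625, 687, 718, 734, 742. The worst case over the whole range is 10 guesses, so the limit of 20 that the Guesser sets leaves headroom for iterations in which the model's plan fails or repeats a guess.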

The agent

python

# guesser.py
from opensymbolicai.blueprints import GoalSeeking
from opensymbolicai.core import decomposition, evaluator, primitive
from opensymbolicai.models import ExecutionResult, GoalContext, GoalEvaluation, GoalSeekingConfig

SECRET = 742
GOAL = "Guess the secret number between 1 and 1000."


def _hint(n: int) -> str:
    diff = SECRET - n
    distance = abs(diff)
    direction = "go higher" if diff > 0 else "go lower"
    if distance == 0:   return "correct"
    if distance < 25:   return f"burning — {direction}"
    if distance < 100:  return f"hot — {direction}"
    if distance < 200:  return f"warm — {direction}"
    if distance < 400:  return f"cold — {direction}"
    return f"freezing — {direction}"


class HintContext(GoalContext):
    low: int = 1
    high: int = 1000
    last_hint: str = "no guess yet"


class Guesser(GoalSeeking):
    """An agent that converges on a hidden number using binary search."""

    def __init__(self, **kwargs) -> None:
        cfg = kwargs.pop("config", None) or GoalSeekingConfig(max_iterations=20)
        super().__init__(config=cfg, **kwargs)

    def create_context(self, goal: str) -> HintContext:
        return HintContext(goal=goal)

    def update_context(self, context: HintContext, execution_result: ExecutionResult) -> None:
        # Walk the trace and fold every successful guess() call into the context.
        for step in execution_result.trace.steps:
            if step.primitive_called == "guess" and step.success:
                # Look the argument up as "n", then fall back to "arg0".
                n_arg = step.args.get("n") or step.args.get("arg0")
                if n_arg is None:
                    continue
                n = int(n_arg.resolved_value)
                hint = str(step.result_value)
                context.last_hint = hint
                # Narrow the range; max/min ensure it never widens again.
                if "go higher" in hint:
                    context.low = max(context.low, n + 1)
                elif "go lower" in hint:
                    context.high = min(context.high, n - 1)

    @primitive(read_only=True)
    def midpoint(self, low: int, high: int) -> int:
        """Return the midpoint of [low, high]."""
        return (low + high) // 2

    @primitive(read_only=True)
    def guess(self, n: int) -> str:
        """Guess n. Returns a temperature hint and direction, e.g. 'hot — go lower'."""
        return _hint(n)

    @decomposition(
        intent="guess the midpoint of [low, high] from context",
        expanded_intent=(
            "Each iteration the context shows updated low and high values. "
            "Substitute those exact integers into midpoint(), then call guess(n)."
        ),
    )
    def _example(self) -> str:
        # context: low=501, high=1000 after a 'cold — go higher' on 500
        n = self.midpoint(501, 1000)
        result = self.guess(n)
        return result

    @evaluator
    def _check(self, goal: str, context: HintContext) -> GoalEvaluation:
        return GoalEvaluation(goal_achieved=context.last_hint == "correct")

Run it

python

# main.py
from guesser import GOAL, Guesser
from opensymbolicai.llm import LLMConfig

llm = LLMConfig(provider="ollama", model="qwen2.5-coder:7b")
agent = Guesser(llm=llm)
result = agent.seek(GOAL)

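for iteration in result.iterations:
    # Find the guess() call in this iteration's trace, if the plan made one.
    step = next(
        (s for s in iteration.execution_result.trace.steps if s.primitive_called == "guess"),
        None,
    )
    # As in update_context: look the argument up as "n", then fall back to "arg0".
    arg = (step.args.get("n") or step.args.get("arg0")) if step else None
    n = int(arg.resolved_value) if arg is not None else "?"
    hint = str(step.result_value) if step else "?"
    print(f"iteration {iteration.iteration_number:2d}: guess={n:>4}  hint={hint}")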

print()
print(f"status:     {result.status.value}")
print(f"iterations: {result.iteration_count}")

bash

uv run main.py

text

iteration  1: guess= 500  hint=cold — go higher
iteration  2: guess= 750  hint=burning — go lower
iteration  3: guess= 625  hint=warm — go higher
iteration  4: guess= 687  hint=hot — go higher
iteration  5: guess= 718  hint=burning — go higher
iteration  6: guess= 734  hint=burning — go higher
iteration  7: guess= 742  hint=correct

status:     achieved
iterations: 7

The three new pieces

@evaluator marks the method that decides whether the goal has been reached. It receives the goal string and the current context, and returns a GoalEvaluation. Return goal_achieved=True to stop the loop; return goal_achieved=False and the loop continues.

seek(goal) drives the loop. It returns a GoalSeekingResult with status, iteration_count, and a list of iterations, each carrying its own plan, trace, and evaluation.

GoalStatus records how the loop ended. achieved means the evaluator returned a GoalEvaluation with goal_achieved=True. max_iterations means the iteration limit was reached before that happened (the GoalSeekingConfig default is 10; the Guesser raises it to 20).
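A caller will usually branch on how the loop ended. The sketch below uses a stand-in enum that mirrors only the two values named above. It is not the library's GoalStatus, which may define more members, and `summarize` is a hypothetical helper.

```python
from enum import Enum


class StatusSketch(Enum):
    """Stand-in for GoalStatus; covers only the two values named in this section."""

    ACHIEVED = "achieved"
    MAX_ITERATIONS = "max_iterations"


def summarize(status: StatusSketch, iteration_count: int) -> str:
    """Turn a final status into a one-line message for logs or a CLI."""
    if status is StatusSketch.ACHIEVED:
        return f"goal met after {iteration_count} iterations"
    if status is StatusSketch.MAX_ITERATIONS:
        return f"gave up after {iteration_count} iterations"
    return f"ended with status {status.value}"


print(summarize(StatusSketch.ACHIEVED, 7))
print(summarize(StatusSketch.MAX_ITERATIONS, 20))
```

With the real result object, the same branching could compare result.status.value against these strings, since main.py prints that value as `achieved` on success.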

Context and update_context

HintContext is a GoalContext subclass. Its fields (low, high, last_hint) are injected into the prompt as literals each iteration, so the model substitutes the current range into midpoint(low, high) without needing to read the raw trace itself.

update_context is called after each execution. It reads the hint from the trace and narrows the range. The evaluator and planner never touch ExecutionResult directly: that boundary is enforced by design.
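The narrowing step is simple enough to isolate. The following sketch reimplements it against a plain dataclass so it can be run without the library; `RangeState` and `narrow` are hypothetical names standing in for HintContext and the body of update_context.

```python
from dataclasses import dataclass


@dataclass
class RangeState:
    """Stand-in for HintContext: the remaining range plus the latest hint."""

    low: int = 1
    high: int = 1000
    last_hint: str = "no guess yet"


def narrow(state: RangeState, n: int, hint: str) -> None:
    """Shrink [low, high] using the direction embedded in the hint."""
    state.last_hint = hint
    if "go higher" in hint:
        state.low = max(state.low, n + 1)    # secret is above n
    elif "go lower" in hint:
        state.high = min(state.high, n - 1)  # secret is below n
    # "correct" carries no direction, so the range is left unchanged.


state = RangeState()
narrow(state, 500, "cold — go higher")
narrow(state, 750, "burning — go lower")
print(state.low, state.high)
```

After these two hints the remaining range is 501 to 749. Because of the max and min guards, a stale or repeated guess cannot widen the range again.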