Prompts inside primitives
The planning call writes orchestration code. Small, focused model calls inside individual primitives handle judgment: classification, extraction, relevance scoring, rephrasing. Each gets only the data it needs.
Before you start
The planning call writes the orchestration: which primitives to call, in what order, with what arguments. But some problems need a model to solve them: is this text positive or negative? which category does this ticket belong to? what is the answer to this question given this passage?
Those are not orchestration decisions. They are judgment calls. The right place for them is inside a primitive, in a small focused prompt that sees only the data it needs.
The pattern#
@primitive(read_only=True, deterministic=False)
def classify(self, text: str, categories: list[str]) -> str:
"""Classify text into one of the given categories."""
prompt = (
f"Classify the following text into exactly one of these categories: "
f"{', '.join(categories)}.\n"
f"Reply with the category name only.\n\n"
f"{text}"
)
response = self._llm.generate(prompt)
return response.text.strip()Two things to note:
deterministic=Falsetells the framework this primitive calls a model. The same input may not always return the same output.- The prompt is 3 lines. It gets the text and the categories. Nothing else.
The plan treats the return value like any other primitive result:
label = classify(ticket_text, ["bug", "feature", "question"])
route = assign_team(label)
return routeExamples#
Classification#
Route a support ticket, tag a document, label a review.
@primitive(read_only=True, deterministic=False)
def classify(self, text: str, categories: list[str]) -> str:
"""Classify text into one of the given categories."""
prompt = (
f"Pick one category from this list: {', '.join(categories)}.\n"
f"Reply with the category name only.\n\n{text}"
)
return self._llm.generate(prompt).text.strip()Extract answer#
The core pattern for retrieval-augmented generation. The plan fetches a passage; this primitive pulls the answer out of it.
@primitive(read_only=True, deterministic=False)
def extract_answer(self, question: str, passage: str) -> str:
"""Answer a question using only the given passage."""
prompt = (
f"Answer the question using only the passage below.\n"
f"If the answer is not in the passage, reply 'not found'.\n\n"
f"Question: {question}\n\n"
f"Passage: {passage}"
)
return self._llm.generate(prompt).text.strip()Relevance score#
Score how well a passage answers a query. The plan can sort or filter by score, or stop fetching once the score drops below a threshold.
@primitive(read_only=True, deterministic=False)
def relevance(self, query: str, passage: str) -> float:
"""Score how relevant the passage is to the query, from 0.0 to 1.0."""
prompt = (
f"Rate how relevant the following passage is to the query on a scale "
f"from 0.0 (not relevant) to 1.0 (directly answers it).\n"
f"Reply with a single decimal number only.\n\n"
f"Query: {query}\n\nPassage: {passage}"
)
try:
return round(float(self._llm.generate(prompt).text.strip()), 2)
except ValueError:
return 0.0Rephrase#
Rewrite text in a different tone. The plan can chain this after any primitive that returns a string.
@primitive(read_only=True, deterministic=False)
def rephrase(self, text: str, tone: str) -> str:
"""Rewrite text in the given tone (e.g. formal, casual, concise)."""
prompt = (
f"Rewrite the following text in a {tone} tone.\n"
f"Return only the rewritten text.\n\n{text}"
)
return self._llm.generate(prompt).text.strip()Entity extraction#
Pull names, places, organizations, or any other entity type out of text. Returns a list the plan can loop over, deduplicate, or pass to a lookup.
@primitive(read_only=True, deterministic=False)
def extract_entities(self, text: str, entity_type: str) -> list[str]:
"""Extract all entities of a given type from the text."""
prompt = (
f"List all {entity_type} mentioned in the text below.\n"
f"One per line. If none, reply with an empty line.\n\n{text}"
)
lines = self._llm.generate(prompt).text.strip().splitlines()
return [l.strip() for l in lines if l.strip()]Intent detection#
Classify what a user wants before deciding what to do. Useful as a routing step at the top of a plan.
@primitive(read_only=True, deterministic=False)
def detect_intent(self, message: str, intents: list[str]) -> str:
"""Detect the user's intent from a fixed list."""
prompt = (
f"What does the user want? Choose one: {', '.join(intents)}.\n"
f"Reply with the intent only.\n\n{message}"
)
return self._llm.generate(prompt).text.strip()Claim verification#
Check whether a source supports a claim. Used in fact-checking pipelines after retrieval.
@primitive(read_only=True, deterministic=False)
def verify_claim(self, claim: str, source: str) -> bool:
"""Return True if the source supports the claim."""
prompt = (
f"Does the source below support the following claim?\n"
f"Reply with yes or no only.\n\n"
f"Claim: {claim}\n\nSource: {source}"
)
return self._llm.generate(prompt).text.strip().lower().startswith("yes")What these have in common#
Every example above follows the same shape:
- A tight prompt: one instruction, the minimum data needed, a constrained output format ("reply with the category name only", "return a number only").
deterministic=Falseon the decorator.- A return type the plan can use directly:
str,float,bool,list[str].
The planning call never sees the text, the passage, or the claim. It sees the return value and decides what to do next.
What you avoid#
A traditional approach puts everything in one system prompt: "You are a support agent. Classify tickets. Extract answers. Verify facts. When you see a ticket..." When that breaks, you cannot tell which part failed.
Here each judgment call is one primitive. You can test classify on its own,
swap its prompt without touching anything else, and read exactly which step in
the trace produced a bad result.