Coding agent
An agent reads a Python file, rewrites it per an instruction, saves it in place, and runs it to confirm the output is unchanged.
Before you start
An agent that reads a Python file, asks the LLM to rewrite it, saves the result, and runs it to confirm the output matches. Three recursive programs are converted to iterative: Fibonacci, factorial, and binary search.
Why not just prompt the LLM?#
Pasting source code into a chat and asking for a rewrite works once. It breaks down when you have many files, want the result saved automatically, or need confidence the rewrite still produces the correct output.
An agent with read_file, rewrite, write_file, and run_code turns
the same operation into a repeatable plan. The LLM does the transformation;
Python verifies it.
The agent#
Four primitives cover the full read-transform-write-validate cycle:
# coding_agent.py
class CodingAgent(PlanExecute):
@primitive(read_only=True)
def read_file(self, path: str) -> str:
"""Return the contents of a Python source file."""
with open(path, encoding="utf-8") as f:
return f.read()
@primitive(read_only=True)
def rewrite(self, source: str, instruction: str) -> str:
"""Apply instruction to source and return the new Python source code."""
prompt = (
f"{instruction}\n\n"
"Return ONLY the new Python source code, no explanation, no markdown fences.\n\n"
f"Source:\n{source}"
)
raw = self._llm.generate(prompt).text.strip()
# Strip markdown fences if the model adds them anyway
if raw.startswith("```"):
raw = raw.split("\n", 1)[-1]
if raw.endswith("```"):
raw = raw.rsplit("```", 1)[0]
return raw.strip()
@primitive(read_only=False)
def write_file(self, path: str, content: str) -> str:
"""Write content to path."""
with open(path, "w", encoding="utf-8") as f:
f.write(content)
return f"wrote {path}"
@primitive(read_only=True)
def run_code(self, path: str) -> str:
"""Run a Python file and return stdout, or stderr on failure."""
result = subprocess.run(
[sys.executable, path],
capture_output=True, text=True, timeout=10,
)
if result.returncode != 0:
return f"ERROR:\n{result.stderr.strip()}"
return result.stdout.strip()
@primitive(read_only=True)
def respond(self, message: str) -> str:
"""Return message as the final response."""
return messageThe generated plan is the same shape for every task:
source = read_file("programs/fib.py")
updated = rewrite(source, "Convert the recursive function to iterative...")
write_result = write_file("programs/fib.py", updated)
output = run_code("programs/fib.py")
return respond(output)Run it#
uv add opensymbolicai-core
ollama pull qwen2.5-coder:7b
uv run main.pySample output#
Model: qwen2.5-coder:7b
============================================================
programs/fib.py
----------------------------------------
Diff:
--- a/programs/fib.py
+++ b/programs/fib.py
@@ -1,5 +1,8 @@
def fib(n):
if n <= 1:
return n
- return fib(n - 1) + fib(n - 2)
+ a, b = 0, 1
+ for _ in range(2, n + 1):
+ a, b = b, a + b
+ return b
Output (8.8s):
0 1 1 2 3 5 8 13 21 34
programs/factorial.py
----------------------------------------
Diff:
--- a/programs/factorial.py
+++ b/programs/factorial.py
@@ -1,5 +1,5 @@
def factorial(n):
- if n == 0:
- return 1
- return n * factorial(n - 1)
+ result = 1
+ for i in range(2, n + 1):
+ result *= i
+ return result
Output (3.9s):
0! = 1
1! = 1
2! = 2
...
7! = 5040
programs/binary_search.py
----------------------------------------
Diff:
--- a/programs/binary_search.py
+++ b/programs/binary_search.py
@@ -1,13 +1,13 @@
-def binary_search(arr, target, low=0, high=None):
- if high is None:
- high = len(arr) - 1
- if low > high:
- return -1
- mid = (low + high) // 2
- if arr[mid] == target:
- return mid
- elif arr[mid] < target:
- return binary_search(arr, target, mid + 1, high)
- else:
- return binary_search(arr, target, low, mid - 1)
+def binary_search(arr, target):
+ low, high = 0, len(arr) - 1
+ while low <= high:
+ mid = (low + high) // 2
+ if arr[mid] == target:
+ return mid
+ elif arr[mid] < target:
+ low = mid + 1
+ else:
+ high = mid - 1
+ return -1
Output (6.0s):
Array: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Search 6: index 3
Search 7: index -1
Search 14: index 7What to notice#
write_file is the only mutating primitive. It is marked
read_only=False. The other three primitives only read or compute. This
makes it easy to audit what the agent can change: search for
read_only=False in the class and you have the complete list of side
effects.
run_code closes the loop. The agent executes its own output with the
same Python interpreter. If write_file wrote broken code, run_code
returns the stderr and that string becomes the final answer, making the
failure visible. There is no separate test harness; the program's own
print statements are the assertion.
The diff is computed outside the agent. main.py reads the file before
and after agent.run() and calls difflib.unified_diff. The agent does not
see or produce the diff. This is a useful separation: the agent is
responsible for the transformation; the caller is responsible for recording
what changed.
rewrite strips markdown fences defensively. The prompt says to return
only code, but some model versions add fences anyway. Stripping them in the
primitive means the LLM's formatting habits do not break write_file. Put
cleanup logic in the primitive that needs clean input, not in the plan.
The same four primitives handle any source-level transformation. Add type
hints, rename a function, convert a class to a dataclass, translate between
languages. The instruction passed to rewrite is the only thing that changes.