CSV analyst
An agent answers plain-English questions about a CSV file using real pandas operations. Column names and sample rows are injected into the task so the agent writes correct code in one shot.
Before you start
Ask a data question in plain English, get a prose answer backed by real numbers. The agent writes pandas code against the Titanic passenger dataset, runs it, and converts the result into a readable sentence.
Why not just ask the LLM?#
An LLM can make up plausible statistics. Survival rates, average ages, and median fares stated from memory may be wrong or outdated. Running the pandas operations against the actual file gives exact answers from the real data.
The same pattern from Track 46 applies here: inject the schema up front.
data_context() provides column names, dtypes, and three sample rows before
the first question. With that context the agent writes correct pandas on the
first attempt.
The agent#
Two primitives.
# csv_agent.py
class CsvAgent(PlanExecute):
def __init__(self, df: pd.DataFrame, llm) -> None:
super().__init__(llm=llm)
self._df = df
@primitive(read_only=True)
def get_df(self) -> pd.DataFrame:
"""Return the loaded DataFrame."""
return self._df
@primitive(read_only=True)
def to_string(self, value, question: str) -> str:
"""Convert a pandas result to a readable answer for the given question."""
prompt = (
f"Question: {question}\n\n"
f"Data:\n{value}\n\n"
"Write a concise, plain-English answer to the question using only the data above. "
"No preamble, no markdown."
)
return self._llm.generate(prompt).text.strip()get_df() hands the real DataFrame to the plan. The plan calls pandas methods
directly on the returned object:
df = get_df()
result = df.groupby("pclass")["survived"].mean().round(3)
return to_string(result, "What was the survival rate for each passenger class?")to_string() sends the raw pandas output and the original question to the LLM,
which writes one or two prose sentences.
main.py loads the data once, calls data_context(), and prepends the context
to every question:
df = pd.read_csv(CSV_PATH)
agent = CsvAgent(df=df, llm=llm)
context = agent.data_context()
for question in QUESTIONS:
task = f"Dataset context:\n{context}\n\nQuestion: {question}"
result = agent.run(task)data_context() returns:
Columns:
survived (int64)
pclass (int64)
sex (object)
age (float64)
...
Sample rows (first 3):
survived pclass sex age ...
0 3 male 22.0 ...
1 1 female 38.0 ...
1 3 female 26.0 ...Run it#
uv add opensymbolicai-core pandas
ollama pull qwen2.5-coder:7b
uv run main.pyThe Titanic dataset (~58 KB) is downloaded automatically on first run.
Sample output#
Model : qwen2.5-coder:7b
Dataset: titanic.csv (891 rows, 15 columns)
============================================================
Q: What was the survival rate for each passenger class?
(6.4s plan=710 tok summarize=145 tok total=855 tok)
The survival rate for each passenger class was as follows:
First Class: 63.0%, Second Class: 47.3%, Third Class: 24.2%.
Q: What was the average age of survivors vs non-survivors?
(3.9s plan=776 tok summarize=127 tok total=903 tok)
The average age of survivors was 28.34 compared to 30.63 for non-survivors.
Q: Which sex had a higher survival rate, and by how much?
(2.6s plan=715 tok summarize=130 tok total=845 tok)
Females had a higher survival rate than males by 0.553 (0.742 - 0.189).
Q: What were the top 3 embarkation towns by number of passengers?
(2.7s plan=716 tok summarize=145 tok total=861 tok)
The top 3 embarkation towns were Southampton with 644, Cherbourg with 168,
and Queenstown with 77.
Q: What is the median fare paid by survivors vs non-survivors?
(3.1s plan=756 tok summarize=89 tok total=845 tok)
The median fare paid by survivors was $26.0, while non-survivors paid $10.5.What to notice#
get_df() returns a real DataFrame. The plan calls pandas methods on the
returned object directly: .groupby(), .mean(), .value_counts(). There is
no code string, no exec(), no eval(). The primitives are ordinary Python
functions and the plan is ordinary Python code; the LLM writes method calls,
not strings of code to be evaluated.
Two LLM calls per question. The first generates the plan (plan tokens).
The second runs inside to_string() to convert raw pandas output into a prose
sentence (summarize tokens). Both counts are printed. The plan call is the
larger of the two because it includes the full dataset context; the summarize
call sees only the computed result and the question.
to_string() is the linguistic layer. A groupby result is a Series with
numeric index labels. Turning that into "First Class: 63.0%, Second Class:
47.3%" is a natural language task, not a pandas task. Keeping it in a separate
primitive makes the split explicit: pandas does the computation, the LLM does
the narration.
The decomposition example teaches the plan shape. One @decomposition
shows the three-step pattern: get_df(), compute with pandas, to_string().
That is enough for the LLM to generalize to all five questions, including ones
that use different pandas operations.