CSV analyst

Ask a data question in plain English, get a prose answer backed by real numbers. The agent writes pandas code against the Titanic passenger dataset, runs it, and converts the result into a readable sentence.

Why not just ask the LLM?#

An LLM can make up plausible statistics. Survival rates, average ages, and median fares stated from memory may be wrong or outdated. Running the pandas operations against the actual file gives exact answers from the real data.

The same pattern from Track 46 applies here: inject the schema up front. data_context() provides column names, dtypes, and three sample rows before the first question. With that context the agent writes correct pandas on the first attempt.

The agent#

Two primitives.

python

# csv_agent.py
class CsvAgent(PlanExecute):

    def __init__(self, df: pd.DataFrame, llm) -> None:
        super().__init__(llm=llm)
        self._df = df

    @primitive(read_only=True)
    def get_df(self) -> pd.DataFrame:
        """Return the loaded DataFrame."""
        return self._df

    @primitive(read_only=True)
    def to_string(self, value, question: str) -> str:
        """Convert a pandas result to a readable answer for the given question."""
        prompt = (
            f"Question: {question}\n\n"
            f"Data:\n{value}\n\n"
            "Write a concise, plain-English answer to the question using only the data above. "
            "No preamble, no markdown."
        )
        return self._llm.generate(prompt).text.strip()

get_df() hands the real DataFrame to the plan. The plan calls pandas methods directly on the returned object:

python

df     = get_df()
result = df.groupby("pclass")["survived"].mean().round(3)
return to_string(result, "What was the survival rate for each passenger class?")

to_string() sends the raw pandas output and the original question to the LLM, which writes one or two prose sentences.

main.py loads the data once, calls data_context(), and prepends the context to every question:

python

df      = pd.read_csv(CSV_PATH)
agent   = CsvAgent(df=df, llm=llm)
context = agent.data_context()

for question in QUESTIONS:
    task   = f"Dataset context:\n{context}\n\nQuestion: {question}"
    result = agent.run(task)

data_context() returns:

text

Columns:
  survived (int64)
  pclass (int64)
  sex (object)
  age (float64)
  ...

Sample rows (first 3):
 survived  pclass     sex   age  ...
        0       3    male  22.0  ...
        1       1  female  38.0  ...
        1       3  female  26.0  ...

Run it#

bash

uv add opensymbolicai-core pandas
ollama pull qwen2.5-coder:7b
uv run main.py

The Titanic dataset (~58 KB) is downloaded automatically on first run.

Sample output#

text

Model  : qwen2.5-coder:7b
Dataset: titanic.csv (891 rows, 15 columns)
============================================================

Q: What was the survival rate for each passenger class?
   (6.4s  plan=710 tok  summarize=145 tok  total=855 tok)
     The survival rate for each passenger class was as follows:
     First Class: 63.0%, Second Class: 47.3%, Third Class: 24.2%.

Q: What was the average age of survivors vs non-survivors?
   (3.9s  plan=776 tok  summarize=127 tok  total=903 tok)
     The average age of survivors was 28.34 compared to 30.63 for non-survivors.

Q: Which sex had a higher survival rate, and by how much?
   (2.6s  plan=715 tok  summarize=130 tok  total=845 tok)
     Females had a higher survival rate than males by 0.553 (0.742 - 0.189).

Q: What were the top 3 embarkation towns by number of passengers?
   (2.7s  plan=716 tok  summarize=145 tok  total=861 tok)
     The top 3 embarkation towns were Southampton with 644, Cherbourg with 168,
     and Queenstown with 77.

Q: What is the median fare paid by survivors vs non-survivors?
   (3.1s  plan=756 tok  summarize=89 tok  total=845 tok)
     The median fare paid by survivors was $26.0, while non-survivors paid $10.5.

What to notice#

get_df() returns a real DataFrame. The plan calls pandas methods on the returned object directly: .groupby(), .mean(), .value_counts(). There is no code string, no exec(), no eval(). The primitives are ordinary Python functions and the plan is ordinary Python code; the LLM writes method calls, not strings of code to be evaluated.

Two LLM calls per question. The first generates the plan (plan tokens). The second runs inside to_string() to convert raw pandas output into a prose sentence (summarize tokens). Both counts are printed. The plan call is the larger of the two because it includes the full dataset context; the summarize call sees only the computed result and the question.

to_string() is the linguistic layer. A groupby result is a Series with numeric index labels. Turning that into "First Class: 63.0%, Second Class: 47.3%" is a natural language task, not a pandas task. Keeping it in a separate primitive makes the split explicit: pandas does the computation, the LLM does the narration.

The decomposition example teaches the plan shape. One @decomposition shows the three-step pattern: get_df(), compute with pandas, to_string(). That is enough for the LLM to generalize to all five questions, including ones that use different pandas operations.