Track 23 · Data & Types

Fetched data stays in Python, not the prompt

The agent downloads full Wikipedia articles and analyses them as Python variables. The text never re-enters the model context, whether the article is 30,000 characters or 80,000.

intermediate · 10 min
Browse this tutorial's folder in tutorials-py: github.com/OpenSymbolicAI/tutorials-py/tree/main/23-wikipedia

Track 22 showed that a list generated by one primitive reaches the next as a Python variable. This track applies the same idea to real-world data: full Wikipedia articles downloaded at plan execution time. The model writes the plan before any article is fetched. When the plan runs, each fetch() call stores the article text in a variable, and everything after that operates on those variables in pure Python. The article content never enters the planning prompt.
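
The idea of a plan that runs against Python variables can be illustrated with a toy executor. This is a minimal sketch, not the OpenSymbolicAI executor: `fake_fetch` and `run_plan` are hypothetical names, `fake_fetch` returns synthetic text so the example runs offline, and the plan assigns to `result` instead of using `return` because it is run with a bare `exec`.

```python
# Toy illustration only: this is NOT how OpenSymbolicAI executes plans.

def fake_fetch(topic: str) -> str:
    """Hypothetical stand-in for fetch(): returns a large synthetic string."""
    return f"{topic} article. " * 5_000

def word_count(text: str) -> int:
    """Number of whitespace-separated words in the text."""
    return len(text.split())

# The plan is plain text, written before any data exists.
PLAN = """\
python_text = fetch(topic="Python")
python_words = word_count(text=python_text)
result = python_words
"""

def run_plan(plan: str) -> dict:
    """Run the plan in a namespace that exposes only the two primitives."""
    namespace = {"__builtins__": {}, "fetch": fake_fetch, "word_count": word_count}
    exec(plan, namespace)
    return namespace

ns = run_plan(PLAN)
print(len(PLAN))               # characters the planner had to write
print(len(ns["python_text"]))  # characters held only in the Python namespace
print(ns["result"])            # value computed from that variable
```

The large string exists only as `ns["python_text"]`; the plan text stays the same size no matter how much `fake_fetch` returns.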

The agent: WikipediaAnalyst#

The agent exposes a fetch primitive and a set of text-analysis primitives.

python
# analyst.py
import json, re, urllib.parse, urllib.request
from collections import Counter
from opensymbolicai.blueprints import PlanExecute
from opensymbolicai.core import primitive

_TITLES = {
    "python": "Python (programming language)",
    "javascript": "JavaScript",
    "artificial intelligence": "Artificial intelligence",
    "computer science": "Computer science",
    "alan turing": "Alan Turing",
    # ... more disambiguation entries
}

def _fetch_wikipedia(topic: str) -> str:
    headers = {"User-Agent": "osai-tutorial/1.0"}
    title = _TITLES.get(topic.lower().strip())
    if title is None:
        search_url = (
            "https://en.wikipedia.org/w/api.php?action=query&list=search"
            f"&srsearch={urllib.parse.quote(topic)}&format=json&srlimit=1"
        )
        req = urllib.request.Request(search_url, headers=headers)
        with urllib.request.urlopen(req, timeout=10) as resp:
            results = json.loads(resp.read().decode("utf-8"))
        hits = results.get("query", {}).get("search", [])
        title = hits[0]["title"] if hits else topic

    extract_url = (
        "https://en.wikipedia.org/w/api.php?action=query&prop=extracts"
        "&explaintext=1&exsectionformat=plain"
        f"&titles={urllib.parse.quote(title)}&format=json"
    )
    req = urllib.request.Request(extract_url, headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    pages = data.get("query", {}).get("pages", {})
    # Fall back to an empty page so a response with no pages returns "".
    page = next(iter(pages.values()), {})
    return page.get("extract", "")


class WikipediaAnalyst(PlanExecute):

    def __init__(self, **kwargs) -> None:
        super().__init__(**kwargs)
        self._cache: dict[str, str] = {}

    @primitive(read_only=True)
    def fetch(self, topic: str) -> str:
        """Fetch the Wikipedia article text for a topic (e.g. 'Alan Turing')."""
        if topic not in self._cache:
            self._cache[topic] = _fetch_wikipedia(topic)
        return self._cache[topic]

    @primitive(read_only=True)
    def word_count(self, text: str) -> int:
        """Number of words in the text."""
        return len(text.split())

    @primitive(read_only=True)
    def count_mentions(self, text: str, keyword: str) -> int:
        """Number of times keyword appears in text (case-insensitive)."""
        return len(re.findall(re.escape(keyword.lower()), text.lower()))

    @primitive(read_only=True)
    def most_common_words(self, text: str, n: int = 5) -> list[str]:
        """Top n most frequent words in the text."""
        words = re.findall(r"[a-z]+", text.lower())
        return [word for word, _ in Counter(words).most_common(n)]

    @primitive(read_only=True)
    def sentiment_score(self, text: str) -> float:
        """Ask the model to score the sentiment of text from -1.0 to +1.0."""
        prompt = (
            "Rate the overall sentiment of the following text on a scale from "
            "-1.0 (very negative) to +1.0 (very positive). "
            "Reply with only a single decimal number, nothing else.\n\n"
            f"{text}"
        )
        response = self._llm.generate(prompt)
        try:
            return round(float(response.text.strip()), 4)
        except ValueError:
            return 0.0

    @primitive(read_only=True)
    def sentiment_label(self, score: float) -> str:
        """Map a sentiment score to 'positive', 'neutral', or 'negative'."""
        if score > 0.1:
            return "positive"
        if score < -0.1:
            return "negative"
        return "neutral"

    @primitive(read_only=True)
    def concat(self, a: str, b: str) -> str:
        """Concatenate two texts."""
        return a + " " + b

    @primitive(read_only=True)
    def compare_counts(
        self, label_a: str, count_a: int, label_b: str, count_b: int
    ) -> str:
        """Return a formatted comparison: 'Python (1,247) > JavaScript (891)'."""
        if count_a > count_b:
            return f"{label_a} ({count_a:,}) > {label_b} ({count_b:,})"
        elif count_b > count_a:
            return f"{label_b} ({count_b:,}) > {label_a} ({count_a:,})"
        return f"{label_a} = {label_b} ({count_a:,})"

The model sees only the signatures. It knows fetch takes a topic: str and returns str. It does not know how large that string will be, and it does not see the content when the plan executes.
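
To make "only the signatures" concrete, the sketch below renders method signatures and docstrings with the standard `inspect` module. It is an assumption about what such a listing could look like, not the prompt format OpenSymbolicAI actually builds; `AnalystSketch` and `describe` are hypothetical names.

```python
import inspect

class AnalystSketch:
    """Hypothetical stand-in carrying two of the tutorial's primitives."""

    def fetch(self, topic: str) -> str:
        """Fetch the Wikipedia article text for a topic (e.g. 'Alan Turing')."""
        raise NotImplementedError  # the body is irrelevant to a planner

    def word_count(self, text: str) -> int:
        """Number of words in the text."""
        return len(text.split())

def describe(cls) -> str:
    """Render 'name(params) -> return: docstring' for each public method."""
    lines = []
    for name, fn in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue
        sig = inspect.signature(fn)
        # Drop `self` so the listing matches how a plan would call the method.
        params = [p for p in sig.parameters.values() if p.name != "self"]
        sig = sig.replace(parameters=params)
        lines.append(f"{name}{sig}: {inspect.getdoc(fn)}")
    return "\n".join(lines)

print(describe(AnalystSketch))
```

Nothing in the rendered text depends on a method body, which is why the size of the string that `fetch` returns cannot influence planning.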

Run four tasks#

python
# main.py
from analyst import WikipediaAnalyst
from opensymbolicai.llm import LLMConfig

TASKS = [
    "Which has a longer Wikipedia article: Python or JavaScript?",
    "Does 'algorithm' appear more in the Artificial Intelligence article or the Computer Science article?",
    "What are the 5 most common words across the Python and JavaScript articles combined?",
    "What is the sentiment of the Alan Turing Wikipedia article?",
]

llm = LLMConfig(provider="ollama", model="qwen2.5-coder:7b")

for task in TASKS:
    agent = WikipediaAnalyst(llm=llm)
    result = agent.run(task)
    print(f"Task:   {task}")
    print(f"Result: {result.result}")
    print("Plan:")
    for line in result.plan.splitlines():
        print(f"  {line}")
    print()

bash
uv run main.py

Output:

text
Task:   Which has a longer Wikipedia article: Python or JavaScript?
Result: Python (6,143) > JavaScript (5,079)
Plan:
  python_text = fetch(topic="Python")
  javascript_text = fetch(topic="JavaScript")
  python_word_count = word_count(text=python_text)
  javascript_word_count = word_count(text=javascript_text)
  return compare_counts("Python", python_word_count, "JavaScript", javascript_word_count)

Task:   Does 'algorithm' appear more in the Artificial Intelligence article or the Computer Science article?
Result: AI (25) > CS (12)
Plan:
  ai_article = fetch(topic='Artificial Intelligence')
  cs_article = fetch(topic='Computer Science')
  ai_count = count_mentions(text=ai_article, keyword='algorithm')
  cs_count = count_mentions(text=cs_article, keyword='algorithm')
  return compare_counts('AI', ai_count, 'CS', cs_count)

Task:   What are the 5 most common words across the Python and JavaScript articles combined?
Result: ['the', 'a', 'and', 'to', 'in']
Plan:
  python_text = fetch(topic="Python")
  javascript_text = fetch(topic="JavaScript")
  combined_text = concat(a=python_text, b=javascript_text)
  return most_common_words(text=combined_text, n=5)

Task:   What is the sentiment of the Alan Turing Wikipedia article?
Result: positive
Plan:
  article_text = fetch(topic="Alan Turing")
  score = sentiment_score(text=article_text)
  return sentiment_label(score=score)

The plan for each task is generated before any article is fetched. No article text exists in the program yet when the model writes fetch(topic="Python"). The model knows the primitive takes a string topic and returns a string; that is enough to plan with.
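
Because a plan is just text until it runs, it can be inspected before anything is downloaded. The sketch below uses the standard `ast` module to list the functions a plan calls and compare them with the primitive names from analyst.py. Whether OpenSymbolicAI performs a check like this is not stated in this tutorial; `called_names` is a hypothetical helper.

```python
import ast
import textwrap

PRIMITIVES = {
    "fetch", "word_count", "count_mentions", "most_common_words",
    "sentiment_score", "sentiment_label", "concat", "compare_counts",
}

def called_names(plan: str) -> set[str]:
    """Names of functions called in the plan, found without executing it."""
    # Wrap the plan in a function body so a trailing `return` parses.
    body = textwrap.indent(textwrap.dedent(plan).strip(), "    ")
    tree = ast.parse("def _plan():\n" + body + "\n")
    return {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }

plan = '''
python_text = fetch(topic="Python")
javascript_text = fetch(topic="JavaScript")
python_word_count = word_count(text=python_text)
javascript_word_count = word_count(text=javascript_text)
return compare_counts("Python", python_word_count, "JavaScript", javascript_word_count)
'''

names = called_names(plan)
print(sorted(names))        # ['compare_counts', 'fetch', 'word_count']
print(names <= PRIMITIVES)  # True
```

No network request happens here: the plan is parsed, not run.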

How the data moves#

text
OSAI -- articles as Python variables
----------------------------------------------------------
  fetch("Artificial Intelligence")    fetch("Computer Science")
            |                                  |
            |  84,083 chars                    |  29,882 chars
            |  (Python namespace)              |  (Python namespace)
            v                                  v
  count_mentions(ai_article, "algorithm")    count_mentions(cs_article, "algorithm")
            |                                  |
            +--------------+-------------------+
                           v
          compare_counts("AI", 25, "CS", 12)  -->  "AI (25) > CS (12)"
----------------------------------------------------------
  model sees: 5 lines of code, 0 chars of article text


Tool-calling loop -- articles through the context window
----------------------------------------------------------
  fetch("Artificial Intelligence")
            |
            v
  context: "Artificial intelligence (AI) is the simulation of...
            ... (84,083 chars)"         <- model reads the full article
            |
            v
  context: (AI article) + "Computer science is the study of...
            ... (29,882 chars)"         <- +29,882 chars more
            |
            v
  count_mentions("Artificial intelligence...", "algorithm")  <- full article again
  count_mentions("Computer science...", "algorithm")         <- full article again
----------------------------------------------------------
  model sees: 113,965 chars of text, multiple times over
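
The totals in the two diagrams can be checked with a few lines of arithmetic. The character counts and the plan come from the output above; the figure of 4 characters per token is a rough rule of thumb assumed here for illustration, not a measurement of any particular tokenizer.

```python
AI_CHARS = 84_083   # Artificial Intelligence article, from the diagram
CS_CHARS = 29_882   # Computer Science article, from the diagram
CHARS_PER_TOKEN = 4  # assumption: rough average, varies by tokenizer

# The five plan lines from the second task's output.
PLAN = """\
ai_article = fetch(topic='Artificial Intelligence')
cs_article = fetch(topic='Computer Science')
ai_count = count_mentions(text=ai_article, keyword='algorithm')
cs_count = count_mentions(text=cs_article, keyword='algorithm')
return compare_counts('AI', ai_count, 'CS', cs_count)
"""

article_chars = AI_CHARS + CS_CHARS
print(article_chars)                     # 113965
print(len(PLAN))                         # size of the plan text
print(article_chars // CHARS_PER_TOKEN)  # rough token estimate for one pass
```

In the tool-calling loop the article total is a lower bound, since the text is carried through the context on more than one turn.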

The sentiment_score exception#

sentiment_score makes a direct model call from inside the primitive:

python
@primitive(read_only=True)
def sentiment_score(self, text: str) -> float:
    prompt = "Rate the overall sentiment ... \n\n" + text
    response = self._llm.generate(prompt)
    ...

This is a model call, but it happens inside a primitive during plan execution, not during plan generation. The planning step still never sees the article text. The plan just contains score = sentiment_score(text=article_text). The full article is passed to the model only when that line runs.
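
The tutorial's sentiment_score falls back to 0.0 when the reply is not a bare number. A slightly more tolerant parser is sketched below as one possible hardening; it is not part of the tutorial's code. `parse_score` is a hypothetical helper that takes the first number found in the reply and clamps it to the documented range.

```python
import re

# Matches an optionally signed integer or decimal, e.g. "0.35", "-.2", "+1".
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")

def parse_score(reply: str, default: float = 0.0) -> float:
    """Extract the first number in reply and clamp it to [-1.0, 1.0]."""
    match = _NUMBER.search(reply)
    if match is None:
        return default
    value = round(float(match.group()), 4)
    return max(-1.0, min(1.0, value))

print(parse_score("0.35"))          # 0.35
print(parse_score("Score: -0.2."))  # -0.2
print(parse_score("7"))             # 1.0
print(parse_score("no idea"))       # 0.0
```

Clamping keeps an out-of-range reply from producing a score that sentiment_label was not written to expect.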

Primitive reference#

| Primitive | What it does |
| --- | --- |
| fetch(topic) | Download a full Wikipedia article by topic name |
| word_count(text) | Count words |
| char_count(text) | Count characters |
| unique_word_count(text) | Count distinct words |
| count_mentions(text, keyword) | Count occurrences of a keyword |
| most_common_words(text, n) | Top n most frequent words |
| sentiment_score(text) | Score sentiment from -1.0 to +1.0 (model call inside primitive) |
| sentiment_label(score) | Map score to positive / neutral / negative |
| concat(a, b) | Join two texts |
| compare_counts(label_a, count_a, label_b, count_b) | Format a comparison string |
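
The reference lists char_count and unique_word_count, which do not appear in the analyst.py listing shown earlier. The sketch below is a guess at how they might look, written as plain functions; the bodies are assumptions, and the tokenization in unique_word_count simply mirrors the regex used by most_common_words.

```python
import re

def char_count(text: str) -> int:
    """Number of characters in the text (assumed implementation)."""
    return len(text)

def unique_word_count(text: str) -> int:
    """Number of distinct lowercase words (assumed implementation)."""
    # Same [a-z]+ tokenization as most_common_words in the tutorial.
    return len(set(re.findall(r"[a-z]+", text.lower())))

print(char_count("abc def"))             # 7
print(unique_word_count("The the cat"))  # 2
```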

What to notice#

  • The plan is the same shape regardless of article length. A fetch() call may return 30,000 characters or 80,000 characters depending on the article. The plan line is identical either way.
  • Multiple fetches pipe independently. The comparison tasks fetch two articles. Each lands in its own variable; neither interferes with the other.
  • sentiment_score is the only model call after planning. Read the output for the Alan Turing task and compare its plan to the others. The plan is three lines: fetch, score, label. The model call for scoring happens when sentiment_score executes, not when the plan is written.