Voice date agent

Speak a date question out loud. The agent transcribes it with Whisper, computes the answer with a calendar primitive, and reads the result back with the macOS say command.

No date is hardcoded anywhere in the agent. Every answer is computed from the current date at runtime.

Why not just the LLM?

LLMs are unreliable at date arithmetic. Ask one "what date is the last Sunday of next month?" and it will give you a plausible-sounding answer that is often wrong. The calculation involves knowing the current date, advancing a month, finding the last day, and walking backward to the right weekday. Small errors compound.
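To make those steps concrete, here is a minimal standard-library sketch of the same calculation. It is not the agent's code; the function name is hypothetical and the date is the example date used throughout this page.

```python
import calendar
from datetime import date, timedelta


def last_sunday_of_next_month(today: date) -> date:
    """Hypothetical helper: the four steps from the paragraph above."""
    # Step 1: advance one month, rolling over into the next year after December.
    year = today.year + (1 if today.month == 12 else 0)
    month = 1 if today.month == 12 else today.month + 1
    # Step 2: find the last day of that month.
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step 3: walk backward to Sunday (date.weekday(): Monday=0 ... Sunday=6).
    days_back = (last_day.weekday() - 6) % 7
    return last_day - timedelta(days=days_back)


print(last_sunday_of_next_month(date(2026, 6, 13)))  # 2026-07-26
```

Each step is trivial for code and easy for a language model to get slightly wrong, which is why the agent delegates all of them to primitives.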

The agent here never asks the LLM to do arithmetic. The LLM writes a plan that calls add_time, last_weekday_of_month, and days_between. The Python functions do the math. The LLM only decides which primitives to call and in what order.

There is a second reason to use primitives: the LLM does not know today's date. DateContext injects it before the first iteration, and current_date() is available as a primitive if the agent needs to reference it again. The LLM never guesses.

A new blueprint: GoalSeeking

The previous tutorials used PlanExecute: the LLM writes the full plan upfront, then every primitive runs. That works well when the plan can be written without knowing intermediate results.

Date questions are different. To answer "how many days until Christmas?" the agent first needs to call current_date() and get the result, then call days_between with that value. A plan written upfront would have to guess today's date.

GoalSeeking solves this with an iterate-until-done loop:

1. The LLM writes a plan for one iteration.
2. The plan executes.
3. An evaluator checks whether the result satisfies the goal.
4. If not, the accumulated context feeds into the next iteration.

Each iteration sees what the previous one produced. The loop ends when the evaluator is satisfied.
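The shape of that loop can be sketched in plain Python. This is an illustration only, not the framework's implementation: LoopContext, seek, the callback names, and the max_iterations cap are all assumptions made for the example, and the toy planner stands in for the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class LoopContext:
    """Hypothetical stand-in for the state carried between iterations."""
    history: list = field(default_factory=list)
    answer: str = ""


def seek(goal, plan_and_run, update, is_done, max_iterations=5):
    """Iterate until the evaluator is satisfied or the cap is reached."""
    ctx = LoopContext()
    for iteration in range(1, max_iterations + 1):
        result = plan_and_run(goal, ctx)   # write a plan for one iteration and execute it
        ctx.history.append(result)         # accumulated context for the next iteration
        update(ctx, result)
        if is_done(ctx):                   # evaluator check
            return ctx, iteration
    return ctx, max_iterations


def toy_plan_and_run(goal, ctx):
    # First pass returns a raw count; second pass wraps it in a sentence.
    if not ctx.history:
        return 195
    return f"Christmas is in {ctx.history[-1]} days."


def toy_update(ctx, result):
    # Only a sentence (a string containing letters) counts as an answer.
    if isinstance(result, str) and any(ch.isalpha() for ch in result):
        ctx.answer = result


ctx, iterations = seek(
    "How many days until Christmas?",
    toy_plan_and_run,
    toy_update,
    is_done=lambda c: bool(c.answer),
)
print(ctx.answer, iterations)  # Christmas is in 195 days. 2
```

The cap on iterations is a defensive choice for the sketch; whether the real blueprint limits iterations is not stated here.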

The agent

DateAgent extends GoalSeeking and defines seven primitives:

python

# date_agent.py
class DateAgent(GoalSeeking):

    @primitive(read_only=True)
    def current_date(self) -> str:
        """Return today's date as YYYY-MM-DD.

        Example: current_date() -> "2026-06-13"
        """

    @primitive(read_only=True)
    def add_time(self, date_str: str, amount: int, unit: str) -> str:
        """Add time to a date. Unit must be 'day', 'week', 'month', or 'year'.

        Example: add_time("2026-06-13", 4, "day")  -> "2026-06-17"
        Example: add_time("2026-06-13", 1, "month") -> "2026-07-13"
        Example: add_time("2026-06-13", -1, "week") -> "2026-06-06"
        """

    @primitive(read_only=True)
    def weekday_of(self, date_str: str) -> str:
        """Return the weekday name for a given YYYY-MM-DD date.

        Example: weekday_of("2026-06-13") -> "Saturday"
        """

    @primitive(read_only=True)
    def end_of_month(self, date_str: str) -> str:
        """Return the last day of the month for a given date.

        Example: end_of_month("2026-07-13") -> "2026-07-31"
        """

    @primitive(read_only=True)
    def last_weekday_of_month(self, date_str: str, weekday: str) -> str:
        """Return the last occurrence of a weekday in the month of date_str.

        weekday must be a full English name: 'Monday', ..., 'Sunday'.
        Example: last_weekday_of_month("2026-07-01", "Sunday") -> "2026-07-26"
        """

    @primitive(read_only=True)
    def days_between(self, from_date: str, to_date: str) -> int:
        """Return the number of days from from_date to to_date (YYYY-MM-DD).

        Example: days_between("2026-06-13", "2026-12-25") -> 195
        """

    @primitive(read_only=True)
    def format_response(self, question: str, result: str) -> str:
        """Use an LLM to produce a natural spoken sentence from a date or count.

        Example: format_response("When is next Tuesday?", "2026-06-16")
                 -> "Next Tuesday falls on Tuesday, 16 June 2026."
        """

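The listing above shows signatures and docstrings only. As a rough idea of what the calendar primitives might do internally, here is a standard-library sketch written as plain functions. The bodies are assumptions, not the agent's code; in particular, clamping to the end of the month when adding months (31 January plus one month gives 28 or 29 February) is a choice made for this example.

```python
import calendar
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def add_time(date_str: str, amount: int, unit: str) -> str:
    d = date.fromisoformat(date_str)
    if unit == "day":
        return (d + timedelta(days=amount)).isoformat()
    if unit == "week":
        return (d + timedelta(weeks=amount)).isoformat()
    if unit in ("month", "year"):
        months = amount if unit == "month" else 12 * amount
        # Count months from year 0 so negative amounts roll back correctly.
        year, month_index = divmod(d.year * 12 + (d.month - 1) + months, 12)
        month = month_index + 1
        # Assumption: clamp the day to the length of the target month.
        day = min(d.day, calendar.monthrange(year, month)[1])
        return date(year, month, day).isoformat()
    raise ValueError(f"unsupported unit: {unit!r}")


def weekday_of(date_str: str) -> str:
    # Fixed English names avoid the locale dependence of strftime("%A").
    return WEEKDAYS[date.fromisoformat(date_str).weekday()]


def end_of_month(date_str: str) -> str:
    d = date.fromisoformat(date_str)
    return d.replace(day=calendar.monthrange(d.year, d.month)[1]).isoformat()


def days_between(from_date: str, to_date: str) -> int:
    return (date.fromisoformat(to_date) - date.fromisoformat(from_date)).days


print(add_time("2026-06-13", 1, "month"))        # 2026-07-13
print(weekday_of("2026-06-13"))                  # Saturday
print(end_of_month("2026-07-13"))                # 2026-07-31
print(days_between("2026-06-13", "2026-12-25"))  # 195
```

Because every function takes and returns ISO date strings, the output of one call can be passed straight into the next inside a plan.
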
Goal context

GoalSeeking uses a context object to accumulate state across iterations:

python

class DateContext(GoalContext):
    today: str = ""   # pre-populated: "2026-06-13 (Saturday)"
    answer: str = ""  # set once the agent produces a spoken sentence

today is set when the goal starts, so the agent knows the current date without calling current_date() as a separate step. answer is populated when a plan returns a natural language sentence. The evaluator checks bool(context.answer) and ends the loop as soon as it is non-empty.

A raw value such as the date "2026-07-26" or the count 195 consists only of digits and hyphens, so it matches the regex [\d\-]+ and does not set answer. A spoken sentence like "The last Sunday of next month will be July 26th, 2026." does not match, so it is stored and the goal is done.
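A minimal sketch of that check, assuming the regex is applied as a full match against the whole result (a partial match would also reject sentences that merely contain a date). The function names are hypothetical and not part of the framework.

```python
import re

RAW_VALUE = re.compile(r"[\d\-]+")


def maybe_set_answer(current_answer: str, result) -> str:
    """Return the new value for context.answer after one iteration."""
    text = str(result).strip()
    # Raw dates and counts are intermediate results, not spoken answers.
    if not text or RAW_VALUE.fullmatch(text):
        return current_answer
    return text


def is_done(answer: str) -> bool:
    # The evaluator described above: any non-empty answer ends the loop.
    return bool(answer)


print(repr(maybe_set_answer("", "2026-07-26")))  # ''
print(repr(maybe_set_answer("", 195)))           # ''
print(is_done(maybe_set_answer("", "Christmas is in 195 days!")))  # True
```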

Voice pipeline

main.py wraps the agent in a voice loop:

python

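# main.py (simplified)
import subprocess

from faster_whisper import WhisperModel
import sounddevice as sd   # used by the recording code, omitted here
import soundfile as sf     # used by the recording code, omitted here

from date_agent import DateAgent

model = WhisperModel("tiny.en", device="cpu")
agent = DateAgent(model="qwen2.5-coder:7b")

def transcribe(audio_path):
    # faster-whisper yields segments lazily; join their text into one string.
    segments, _ = model.transcribe(audio_path)
    return " ".join(s.text.strip() for s in segments).strip()

def speak(text):
    # Pass the text as an argument rather than through a shell,
    # so quotes in the answer cannot break the command.
    subprocess.run(["say", "-v", "Ava", text])   # macOS; pyttsx3 on other platforms

def ask(question):
    result = agent.seek(question)
    return result.final_answer, result.iterations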

agent.seek() is the GoalSeeking equivalent of agent.run(). It returns an object with final_answer (the spoken sentence) and iterations (how many planning cycles ran).
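The speak function above mentions pyttsx3 for other platforms. One way that fallback could look is sketched below; the platform check and the say_command helper are assumptions for illustration, and pyttsx3 is imported lazily so the macOS path does not need it installed.

```python
import platform
import subprocess


def say_command(text: str, voice: str = "Ava") -> list:
    """Build the argument list for the macOS say command (hypothetical helper)."""
    return ["say", "-v", voice, text]


def speak(text: str) -> None:
    if platform.system() == "Darwin":
        # An argument list avoids shell quoting problems with the answer text.
        subprocess.run(say_command(text), check=False)
    else:
        import pyttsx3  # third-party fallback, only needed off macOS

        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
```

Calling speak("Two weeks from today is June 27th, 2026.") would then use whichever backend matches the host system.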

Install

bash

uv add opensymbolicai-core faster-whisper sounddevice soundfile numpy pyttsx3

bash

ollama pull qwen2.5-coder:7b

faster-whisper runs the Whisper model locally. No API key required. pyttsx3 is a fallback TTS for non-macOS systems.

Run it

Demo mode answers four preset questions without a microphone:

bash

uv run main.py --demo

Live mic input:

bash

uv run main.py

Press Enter to start recording, speak your question, and wait for the spoken reply.

Sample output

text

Voice Date Agent
========================================

Question: What day is next Tuesday?
  Iteration 1: result = add_time('2026-06-13', 4, 'day')
               return format_response('What day is next Tuesday?', result)
Answer:   Next Tuesday will be June 17th.

Question: When is the last Sunday of next month?
  Iteration 1: result = last_weekday_of_month('2026-07-01', 'Sunday')
               return format_response('When is the last Sunday of next month?', result)
Answer:   The last Sunday of next month will be July 26th, 2026.

Question: How many days until Christmas?
  Iteration 1: d = days_between('2026-06-13', '2026-12-25')
               return d
  Iteration 2: days_until_christmas = days_between(current_date(), '2026-12-25')
               return format_response('How many days until Christmas?', str(days_until_christmas))
Answer:   Christmas is in 195 days!

Question: What date is two weeks from today?
  Iteration 1: result = add_time('2026-06-13', 2, 'week')
               return format_response('What date is two weeks from today?', result)
Answer:   Two weeks from today is June 27th, 2026.

What to notice

Most questions finish in one iteration. The agent already knows today's date from DateContext.today, so it can compute the answer and call format_response in a single plan. GoalSeeking pays the cost of a loop; good primitives and a pre-loaded context keep that cost to one iteration in practice.

Christmas takes two iterations. The first plan computes days_between and returns the raw number 195. That satisfies no one as a spoken answer, so the evaluator keeps the loop running. The second iteration recognises the raw number in context and wraps it with format_response. This is the loop working as designed: incomplete results feed into the next step.

last_weekday_of_month makes a hard question easy. Without it, the agent would have to walk backward from the end of the month, checking each day's weekday. The LLM gets this wrong reliably. With the primitive, the answer is one function call. The design principle is the same as in Track 33 and Track 36: give the agent a primitive that is exactly as expressive as the problem requires.

format_response is an internal LLM call inside a primitive. After computing a raw date or count, the plan passes it to format_response, which calls self._llm.generate() directly to produce a natural sentence. The plan only sees the string that comes back. The same pattern appeared in Track 38 (select_relevant_keys) and Track 39 (summarize): use an internal LLM call when the transformation is linguistic, not logical.

The voice pipeline is ordinary Python. Whisper, sounddevice, and say are not framework concepts. They wrap the agent call the same way any other I/O wrapper would. Swapping say for a cloud TTS API, or faster-whisper for a streaming ASR service, would not change the agent.