Properties
category: reference
tags: [agents, irc, mcp, architecture]
last_updated: 2026-03-16
confidence: medium

Agent IRC Architecture

An architecture for multi-agent coordination over IRC, where a human (the PM) and AI agents share a message bus. The goal is to externalize the coordination layer that currently lives inside Claude Code's context window, so that agents preserve context for their actual work, the human can participate from a phone or terminal, and the whole system is observable by just reading the chat.

The problem

The current agent workflow runs everything inside a single Claude Code session tree. The orchestrator dispatches managers via Task, managers dispatch workers, questions relay back up the chain. This works, but:

  • The orchestrator's context fills up relaying messages it doesn't need to reason about.
  • The human is behind a three-hop relay for every question (worker → manager → orchestrator → human → back). You have to be at the terminal.
  • There's no way to observe what agents are doing without being the orchestrator. No dashboard, no logs, no lurking.
  • You can't intervene in a task without going through the orchestrator.

The idea

Put everyone on IRC. The PM, the EM, the workers — all peers on a shared message bus. The coordination protocol moves from in-process function calls to channel messages. Everything is observable by joining the channel. The human can participate from a phone IRC client, a terminal, or both.

Org structure

PM (you) — sets priorities, answers product questions, makes scope decisions. Hangs out in #project-{slug}. Doesn't manage the sprint — that's the EM's job. Can peek into any channel but mostly watches the project channel for decisions that need input.

PM delegate — optionally, a Claude.ai session connected to the IRC MCP, acting as the PM's mouthpiece when the PM is AFK. The PM talks to Claude.ai from their phone or browser; Claude.ai speaks on the PM's behalf in the channels. This sidesteps the mobile IRC client problem entirely — the PM's interface is whatever Claude.ai runs on.

EM (coordinator) — a long-running Claude Code SDK session (Opus) that runs the team. Breaks down requirements into tasks, assigns work, tracks progress, makes implementation decisions, surfaces product questions to the PM (or PM delegate). Lives in #project-{slug} and #standup-{slug}. Shields the PM from implementation noise.

Managers — Claude Code SDK sessions (Opus) that own individual tasks. Follow the proceed workflow: plan, implement, test, review, fix, document. Each manager gets a #work-{task-id} channel for its workers. Reports status and completions to #standup-{slug}.

Workers — Claude Code SDK sessions (Sonnet/Haiku) dispatched by managers for specific jobs: implementation, testing, review, documentation. Operate in #work-{task-id} channels. Disposable — when context fills up, they summarize and exit.

The EM decides what needs the PM's input vs. what it can handle itself. Rule of thumb: anything that changes scope, user-facing behavior, or architecture goes to #project-{slug}. Anything that's purely implementation strategy, the EM decides. The EM should also push back on the PM when something is technically inadvisable, just like a real EM would.

Channel topology

All channels exist on a single IRC server. Multiple projects share the server, namespaced by slug.

  • #project-{slug} — PM + EM coordination. Product decisions, priority calls, scope questions. Low traffic, high signal.
  • #standup-{slug} — EM + managers. Task assignments, status updates, completion reports. The sprint board. PM can lurk here if they want more detail.
  • #work-{task-id} — manager + workers for a specific task. Implementation discussion, test results, review feedback. Noisy and disposable. Created when a task starts, abandoned when it completes.
  • #errors — dead-letter channel. Any agent that hits an unrecoverable failure posts here. Monitored by the EM and optionally by the PM.

Agent naming

Agents get human names, not mechanical identifiers. A conversation between schuyler, Harper, and Dinesh is immediately readable. A conversation between em-robot, mgr-e2-cdn, and worker-3 is a SCADA dashboard.

Names also help with the shift-change problem. When Ramona hits context exhaustion and hands off to Jules, that's a legible event — new person joined, picked up the thread. If worker-3 gets replaced by another worker-3, it's invisible, and that invisibility is exactly the kind of thing that causes confusion.

A names file (names.txt, one per line) lives in the repo. The supervisor pops a name off the list when spawning a process and passes it as the IRC nick. The name also goes into the agent's system prompt so it knows who it is. Names are not reused within a session — once Ramona exits, that name is retired until the list resets.

The EM gets a persistent name that doesn't rotate — it's the one constant in the channel. Think of it as the team lead who's always there. Managers and workers get fresh names each time they're spawned.
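A minimal pool with the retire-on-pop semantics described above, as a sketch (NamePool is a hypothetical class, not existing code):

```python
from pathlib import Path

class NamePool:
    """Pops names in file order; popped names are retired for the session."""

    def __init__(self, names: list[str]):
        self._available = list(names)
        self._retired: list[str] = []

    @classmethod
    def from_file(cls, path: Path) -> "NamePool":
        # names.txt: one name per line, blank lines ignored
        lines = path.read_text().splitlines()
        return cls([name.strip() for name in lines if name.strip()])

    def pop(self) -> str:
        if not self._available:
            raise RuntimeError("names.txt exhausted; reset the list")
        name = self._available.pop(0)
        self._retired.append(name)
        return name
```

The EM's persistent name would simply never come from this pool.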

Transport abstraction

IRC is the first backend, but the architecture shouldn't be welded to it. A thin transport interface keeps options open:

from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Protocol

class Transport(Protocol):
    async def send(self, channel: str, message: str, sender: str) -> None: ...
    async def read(self, channel: str, since: datetime | None = None) -> list[Message]: ...
    async def create_channel(self, name: str) -> None: ...
    async def list_channels(self) -> list[str]: ...
    async def get_members(self, channel: str) -> list[str]: ...

@dataclass
class Message:
    channel: str
    sender: str
    text: str
    timestamp: datetime

The IRC implementation wraps an async IRC client library (bottom or irc). A Zulip or Matrix implementation could be swapped in later — Zulip's topic-per-stream model maps particularly well (stream = project, topic = task).

MCP bridge

A FastMCP server wraps the transport and exposes tools to agents. This is the only interface agents use — they never touch IRC directly.

Design principle: conversational, not structured IPC

A core goal of this architecture is that a human can join any channel and immediately understand what's happening. If agents are posting JSON blobs, the channels are just as opaque as Claude Code's Task tool — you've traded one black box for a noisier one.

Agents communicate in natural language. The EM assigns a task by saying so in plain English. The manager reports a plan the same way. The PM can read #standup-{slug} on their phone and immediately follow the state of the sprint without parsing anything.

The only concession to machine-parseability is lightweight conventions for the supervisor — the EM prefixes task assignments with TASK: so the supervisor can pattern-match without NLP. Everything else is natural language.

Tools

  • send_message(channel, text) — Post a message to a channel.
  • read_messages(channel, since?, limit?) — Read recent messages from a channel. Returns newest-first.
  • create_channel(name) — Create a new channel (used by EM when spinning up task channels).
  • list_channels() — List active channels.
  • get_members(channel) — List who's in a channel.

That's it. No post_task, claim_task, poll_for_task. Task assignment, claiming, and completion are conversational acts, not structured API calls. The EM says "do this," the manager says "on it," the manager says "done."

Task state is tracked by the EM reading channel history and reasoning about it, not by a state machine. This is less reliable than a database but vastly more observable and simpler to build. If it breaks, you can see exactly where it broke by reading the channel.

Message length

ergo supports the IRCv3 maxline capability, allowing messages up to ~8KB (vs. the traditional 512-byte limit). The MCP bridge should negotiate maxline on connect. For messages that still exceed the limit, the bridge chunks transparently — agents don't need to worry about it.
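A sketch of the chunking the bridge might do: split on spaces where possible, hard-split oversized words by UTF-8 byte length. The 8000-byte default is an assumption pending maxline verification:

```python
def chunk_message(text: str, max_bytes: int = 8000) -> list[str]:
    """Split text into chunks whose UTF-8 encoding fits max_bytes,
    preferring to break at spaces."""
    chunks: list[str] = []
    current = ""
    for word in text.split(" "):
        # Hard-split any single word that alone exceeds the limit.
        while len(word.encode("utf-8")) > max_bytes:
            if current:
                chunks.append(current)
                current = ""
            cut = max_bytes
            # Back off until the slice fits (multi-byte chars can overshoot).
            while len(word[:cut].encode("utf-8")) > max_bytes:
                cut -= 1
            chunks.append(word[:cut])
            word = word[cut:]
        candidate = word if not current else current + " " + word
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

The byte-length check (rather than character count) matters because IRC limits are byte limits and agent output is not ASCII-only.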

Configuration

TRANSPORT_TYPE=irc
IRC_SERVER=<proxmox-host-ip>
IRC_PORT=6667
IRC_NICK=mcp-bridge
MCP_PORT=8090

The MCP server maintains a single IRC connection and multiplexes tool calls from multiple agents. Agents identify themselves via a sender parameter so messages get the right nick attribution.

Shared state: robot.wtf wiki

IRC is ephemeral conversation. Durable shared state lives on a robot.wtf project wiki, accessed by agents via the wiki MCP.

  • Sprint state — EM writes current task assignments, status, and blockers to a project wiki page. Survives EM shift-changes without replaying IRC history.
  • Handoff summaries — outgoing agents write their handoff doc to the wiki. Incoming agent reads it as initial context.
  • Task specs and plans — managers write plans to wiki pages before implementing. PM can review from anywhere.
  • Decision log — architectural decisions, scope changes, PM rulings captured durably.

This means agents get two MCP servers in their config: the IRC bridge for communication, and robot.wtf for persistent shared state. IRC is the conversation, the wiki is the record.

EM recovery after a crash or shift-change: read the project wiki pages to reconstruct sprint state. No dependence on IRC chathistory depth.

Agent lifecycle: long-running with shift-changes

Agents are long-running Claude Code SDK sessions. They persist across tasks, preserving context — a worker that just finished refactoring the auth module still has that code in context when the next auth-related task comes in.

Why the SDK, not the CLI

The Claude Code CLI is designed for a human at a terminal — prompt handling, display rendering, and keybindings are all overhead when the consumer is a daemon. The Claude Code SDK (claude_agent_sdk, ClaudeSDKClient) provides programmatic conversation management without that overhead.

Corrected finding (2026-03-19): An earlier version of this spec noted that the SDK did not include Claude Code's tool definitions and that agents would start with a blank slate. This was wrong. ClaudeSDKClient includes the full Claude Code toolset — Read, Edit, Bash, Glob, Grep, and all other tools available in interactive Claude Code — without any extra configuration.

This removes the "which approach" design question entirely:

  • Agents get the same coding capabilities as interactive Claude Code.
  • No need to write custom tool definitions or shell out to the CLI for actual coding work.
  • The append_system_prompt option injects agent-specific instructions (role, name, channel assignments) on top of the existing Claude Code defaults.

Session state is maintained in-process across query() calls — no subprocess is spawned per turn. Sessions can also be resumed across process restarts via session ID, and forked to branch conversations. These properties directly improve shift-change handling (see below).

Polling and dispatch

Two-layer approach:

Layer 1 — Supervisor push (primary). The supervisor watches IRC directly for TASK: prefixes and immediately calls client.query() on the target agent's session with a "check your channels" prompt. Near-zero dispatch latency.

Layer 2 — Polling (fallback/health). A flat 30-second heartbeat: the supervisor calls client.query() on idle agents with a brief check-in prompt. This catches messages the supervisor missed (e.g., during its own restart) and confirms the agent process is alive. Add backoff (30s → 60s → 120s cap) later if idle token costs warrant it.
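The two layers can be sketched as a single supervisor pass. Everything injected here (read_new_messages, dispatch, the message shape with an assignee field) is a stand-in for the real bridge and client.query() calls:

```python
import asyncio
from datetime import datetime, timedelta

HEARTBEAT_INTERVAL = timedelta(seconds=30)  # flat interval; backoff deferred

async def supervisor_tick(agents, read_new_messages, dispatch, now):
    """One pass of the two-layer loop. `agents` are dicts with nick and
    last_active; the 'assignee' field on messages is illustrative."""
    # Layer 1 — push: new TASK: lines trigger immediate dispatch.
    for msg in await read_new_messages("#standup-demo"):
        if msg["text"].startswith("TASK:"):
            await dispatch(msg["assignee"], "You have a new task; check your channels.")
    # Layer 2 — heartbeat: poke agents idle past the flat 30s interval.
    for agent in agents:
        if now - agent["last_active"] >= HEARTBEAT_INTERVAL:
            await dispatch(agent["nick"], "Heartbeat: any pending messages in your channels?")
```

Running this in a `while True` with a short sleep gives both near-zero dispatch latency (layer 1) and liveness checking (layer 2).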

Response iteration

Supervisor code that drives an agent turn:

# Drive one agent turn: send the prompt, then stream messages until the
# terminal ResultMessage arrives.
await agent.client.query(prompt)
async for message in agent.client.receive_response():
    if isinstance(message, AssistantMessage):
        agent.last_active = datetime.utcnow()   # agent is actively producing output
    elif isinstance(message, ResultMessage):
        agent.session_id = message.session_id   # persisted for restart resume
        agent.input_tokens_total += message.usage.input_tokens
        agent.cost_usd_total += message.cost_usd
        break                                   # turn complete

Multiple agents run concurrently via asyncio.gather() or by scheduling each agent's loop as a separate task.

Idle detection

A Haiku-class ClaudeSDKClient call (single turn, no session state needed):

async with ClaudeSDKClient(options=haiku_options) as checker:
    await checker.query(
        f"Here is the last 5 minutes of IRC activity for agent '{nick}'. "
        f"Is it idle? Respond with only: yes or no.\n\n{activity}"
    )
    async for msg in checker.receive_response():
        if isinstance(msg, ResultMessage):
            idle = msg.result.strip().lower() == "yes"

Pennies per evaluation. The supervisor doesn't need to understand task semantics — just whether to send a heartbeat or let the agent work.

Context monitoring

ResultMessage provides usage.input_tokens per turn and cost_usd for budget tracking. The supervisor accumulates input_tokens across turns and compares against the model's known context window (200K for current Opus and Sonnet models; Sonnet also offers a 1M-token long-context beta) to estimate fullness. There is no built-in "% context used" metric — the accumulated token count is the proxy.

Thresholds:

  • 80% of context window — warn the agent, suggest wrapping up current subtask before the next one starts.
  • 90% of context window — trigger shift-change sequence immediately.

Context exhaustion and shift-changes

When the accumulated token estimate crosses the shift-change threshold:

  1. Supervisor calls agent.client.query() with a handoff prompt: "Write a handoff summary to the wiki and post a notice to your channels."
  2. Agent writes the summary to the project wiki (via wiki MCP) and posts a shift-change notice to its task channel and #standup-{slug}.
  3. Supervisor drains receive_response() to completion and records the final session_id from the ResultMessage.
  4. Supervisor closes the ClaudeSDKClient context (async with exits).
  5. Supervisor pops a new name from names.txt, spawns a fresh ClaudeSDKClient with append_system_prompt containing the agent's role, new name, channel assignments, and a pointer to the wiki handoff page.
  6. New agent reads the handoff page on its first turn and posts an introduction to its channels.

Session resume (session_id) is not used for shift-changes — a new session with a clean context is the point. Session resume is instead useful for supervisor restarts: if the supervisor crashes and restarts, it can reconnect to existing agent sessions by replaying the stored session_id values rather than forcing a shift-change on every live agent.
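The six steps reduce to a short orchestration routine. This is a sketch against a hypothetical agent interface (query/drain/close); the real supervisor would drive ClaudeSDKClient directly:

```python
async def shift_change(agent, spawn_agent, names_pool):
    """Retire a context-exhausted agent and spawn its replacement.
    `agent` is assumed to expose query/drain/close plus role/channels/nick;
    these names are illustrative, not the real supervisor API."""
    # Steps 1-2: the outgoing agent writes its wiki handoff and posts notices.
    await agent.query(
        "Context is nearly full. Write a handoff summary to the wiki "
        "and post a shift-change notice to your channels."
    )
    # Step 3: drain the response stream and record the final session id.
    final = await agent.drain()
    agent.session_id = final.session_id
    # Step 4: close the outgoing session.
    await agent.close()
    # Steps 5-6: fresh session, new name, pointed at the handoff page.
    return await spawn_agent(
        role=agent.role,
        channels=agent.channels,
        nick=names_pool.pop(0),
        handoff_wiki_path=f"projects/handoffs/{agent.nick}",
    )
```

Note that the new session carries nothing over in-process; all continuity flows through the wiki handoff page.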

Git strategy

Multiple agents edit the same repos simultaneously. Branch isolation via git worktrees.

Worktree lifecycle

The supervisor creates worktrees before spawning agents. Agents never touch git worktree add/remove.

~/projects/repo/                        # bind-mounted, main branch (protected)
~/projects/repo/.worktrees/
  agent-task-42/                        # worktree for task 42
  agent-task-71/                        # worktree for task 71

Supervisor sequence per task:

  1. git worktree add .worktrees/agent-task-{id} -b task/{id}
  2. Spawn agent session with cwd set to the worktree
  3. Pass the branch name in the agent's system prompt
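The supervisor-side worktree commands are thin subprocess wrappers. A sketch (error handling beyond check=True omitted):

```python
import subprocess
from pathlib import Path

def create_task_worktree(repo: Path, task_id: str) -> Path:
    """git worktree add .worktrees/agent-task-{id} -b task/{id}; returns the path."""
    worktree = repo / ".worktrees" / f"agent-task-{task_id}"
    worktree.parent.mkdir(exist_ok=True)
    subprocess.run(
        ["git", "worktree", "add", str(worktree), "-b", f"task/{task_id}"],
        cwd=repo, check=True, capture_output=True,
    )
    return worktree

def remove_task_worktree(repo: Path, task_id: str) -> None:
    """Normal-completion cleanup; the task/{id} branch is left in place until merged."""
    worktree = repo / ".worktrees" / f"agent-task-{task_id}"
    subprocess.run(
        ["git", "worktree", "remove", str(worktree)],
        cwd=repo, check=True, capture_output=True,
    )
```

The returned path becomes the agent session's cwd, so the agent can only ever see its own branch.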

Branch and merge strategy

  • Main branch is protected. Agents never get the main worktree path, only task worktree paths.
  • Agents commit freely to their task/{id} branch and push when done.
  • Human PM merges. No auto-merge for MVP. The supervisor or EM can rebase branches and annotate PRs, but the merge button stays with the human.
  • Conflict prevention: the EM avoids assigning overlapping file sets to concurrent agents. When conflicts happen anyway, the supervisor flags them for human resolution.

Cleanup

  • Normal completion: supervisor waits for push, then git worktree remove. Branch retained until merged.
  • Agent crash: supervisor commits any uncommitted work with [INCOMPLETE] prefix, removes worktree, flags task for reassignment.
  • On supervisor restart: reconcile state against git worktree list, clean up orphans.

Architecture components

Three independent components, deployed separately for independent failure domains:

1. ergo IRCd

  • Runs in an LXC container on a Proxmox server (16GB RAM).
  • Set-and-forget after initial configuration.
  • IRCv3 chathistory enabled for convenience, but not load-bearing — sprint state lives on the wiki.
  • maxline capability enabled for longer messages.
  • No TLS needed for LAN traffic in MVP.

2. IRC MCP bridge (FastMCP)

  • ~200 lines of Python.
  • Wraps the transport abstraction with IRC backend.
  • Exposes the five tools above.
  • Negotiates maxline with ergo; chunks messages transparently if needed.
  • Connects to ergo over LAN.
  • Runs in a Docker container on the desktop.

3. Agent supervisor

A Python asyncio process that manages the lifecycle of all ClaudeSDKClient agent sessions. It is the only component that directly constructs or destroys agent sessions.

Agent session wrapper

@dataclass
class AgentSession:
    nick: str                    # IRC nick / human name
    role: str                    # "em" | "manager" | "worker"
    task_id: str | None          # None for the EM
    channels: list[str]          # IRC channels this agent monitors
    client: ClaudeSDKClient      # open SDK session
    session_id: str | None       # from last ResultMessage; persisted for resume
    input_tokens_total: int      # accumulated across all turns
    cost_usd_total: float        # accumulated cost
    last_active: datetime        # updated on each AssistantMessage
    worktree_path: Path | None   # git worktree for this agent's task

Session construction

async def spawn_agent(role, task_id, channels, handoff_wiki_path=None):
    nick = names_pool.pop()
    options = ClaudeAgentOptions(
        append_system_prompt=build_system_prompt(
            nick=nick, role=role, channels=channels,
            handoff_wiki_path=handoff_wiki_path,
        ),
        allowed_tools=["Read", "Edit", "Bash", "Glob", "Grep",
                       "mcp__irc_bridge__*", "mcp__dev_wiki__*"],
        cwd=worktree_path_for(task_id),
        mcp_servers=MCP_SERVER_CONFIGS,
    )
    client = ClaudeSDKClient(options=options)
    # Enter the async context manually: the session outlives this function,
    # so the supervisor owns its lifetime (and calls __aexit__ at retirement).
    await client.__aenter__()
    return AgentSession(nick=nick, role=role, ...)

build_system_prompt() returns a string injected via append_system_prompt — it does not replace Claude Code's defaults, it appends to them. Content includes: agent name, role, channel assignments, and (if resuming from shift-change) a pointer to the wiki handoff page.

Supervisor restart recovery

On startup, the supervisor reads a persisted state file containing {nick, role, task_id, channels, session_id, tokens, cost, worktree_path} for each live agent. For each entry:

  • If the session is still resumable (session_id is valid), reconnect with resume=session_id and send a check-in prompt.
  • If resume fails, treat it as a shift-change: spawn a replacement and have it read the last handoff wiki page.

This means a supervisor crash does not force context loss — agent sessions can be reconnected.
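A sketch of the recovery pass. The resume_session and spawn_replacement callbacks are hypothetical stand-ins for ClaudeSDKClient resume and spawn_agent():

```python
import json
from pathlib import Path

def load_agent_state(path: Path) -> list[dict]:
    """Persisted per-agent records: nick, role, task_id, channels,
    session_id, token/cost totals, worktree_path."""
    if not path.exists():
        return []
    return json.loads(path.read_text())

async def recover(entries, resume_session, spawn_replacement):
    """Try to resume each persisted session; fall back to a shift-change."""
    recovered = []
    for entry in entries:
        try:
            # Happy path: the stored session id is still resumable.
            agent = await resume_session(entry["session_id"], entry)
        except Exception:
            # Resume failed: spawn a replacement that reads the wiki handoff.
            agent = await spawn_replacement(entry)
        recovered.append(agent)
    return recovered
```

The state file should be rewritten on every session_id change, so the window for a stale entry is one turn at most.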

Responsibilities summary

  • Spawn/retire agents — ClaudeSDKClient async context manager
  • Dispatch on TASK: — direct IRC read → client.query()
  • Heartbeat / idle check — 30s loop → Haiku idle classifier → client.query()
  • Context monitoring — accumulate ResultMessage.usage.input_tokens
  • Shift-change — handoff prompt → close → spawn_agent() with wiki path
  • Supervisor restart recovery — persisted state file + session_id resume
  • Git worktrees — git worktree add/remove before/after agent spawn
  • Budget tracking — accumulate ResultMessage.cost_usd per session
Deployment topology

Desktop (128GB RAM)                    Proxmox (16GB RAM)
┌──────────────────────────────┐      ┌──────────────────┐
│ docker-compose               │      │ LXC container    │
│ ┌────────────┐ ┌───────────┐│      │ ┌──────────────┐ │
│ │ Supervisor │ │ IRC MCP   ││ LAN  │ │    ergo      │ │
│ │   (SDK)    │ │  Bridge   ││◄────►│ │    IRCd      │ │
│ └────────────┘ └───────────┘│      │ └──────────────┘ │
│ bind: ~/projects            │      └──────────────────┘
└──────────────────────────────┘
        ▲
        │ IRC client (terminal) or Claude.ai (PM delegate)
        PM

Agents receive two MCP server configs:

  1. IRC bridge — communication with other agents and the PM
  2. robot.wtf wiki — persistent shared state (sprint status, handoff summaries, task specs, decisions)

Relationship to existing Agent_Workflow

What carries forward unchanged:

  • Role definitions (manager, implementer, test runner, Groucho/Chico/Zeppo/Fixer/Documenter)
  • The proceed workflow (plan → implement → test → review → fix → document)
  • Model assignments (Opus for EM and managers, Sonnet for workers, Haiku for idle detection and documentation)
  • Review and fix loop limits (3 attempts before escalating)
  • Worker dispatch guidance (what context to give each worker type)

What changes:

  • Coordination moves from in-process Task/run_in_background to IRC channel messages via MCP
  • The orchestrator role splits: strategic coordination stays with the EM, human interaction moves to the channel
  • Question relay is replaced by direct channel participation — the PM is in the room (or their Claude.ai delegate is)
  • Task state lives on the wiki, conversation happens on IRC
  • Claude Code CLI replaced by Claude Code SDK for programmatic lifecycle management

MVP scope

  1. ergo IRCd in LXC on Proxmox. Single binary, default config, enable chathistory and maxline.
  2. IRC MCP bridge (~200 lines Python). FastMCP wrapping the transport abstraction. Five tools. Docker container.
  3. Agent supervisor — Python, Claude Code SDK, Haiku idle-checker, shift-change logic, worktree management. Docker container.
  4. docker-compose for bridge + supervisor on the desktop, bind-mounting the project directory.
  5. robot.wtf project wiki for shared state (already exists).
  6. One EM process — Opus, system-prompted as the engineering manager.
  7. One manager process — spawned when the EM posts a task.
  8. PM — connected to ergo from terminal (weechat/irssi).
  9. One end-to-end task — EM assigns, manager runs the proceed workflow, PM observes from IRC, state persisted to wiki.

Not in MVP: multiple parallel workers, TLS, remote MCP auth (for Claude.ai PM delegate), multi-project namespacing, Matrix/Zulip backends, polling backoff.

Future: PM delegate via Claude.ai

Once the IRC MCP bridge is exposed as a remote MCP server (with auth), a Claude.ai session can connect to it and act as the PM's delegate. The PM talks to Claude.ai from their phone or browser; Claude.ai participates in IRC channels on their behalf. This replaces the need for a mobile IRC client entirely.

This requires the bridge to be internet-accessible with authentication — not in MVP scope, but the architecture supports it naturally since the bridge already multiplexes by sender.

Open questions

  • ergo maxline default. Verify the actual default and maximum configurable value. Agent plan summaries could be multi-KB.
  • SDK query() API stability. The session resume API (resume option) needs verification against current SDK version. Field names may have changed.

Resolved questions

  • Structured vs. conversational task format. Conversational wins. The whole point of using IRC is human observability. The only structured convention is a TASK: prefix on assignments.
  • CLI vs. SDK. SDK. The CLI's terminal processing is overhead for daemon use.
  • Single process vs. separate bridge and supervisor. Separate. Independent failure domains.
  • Launch-per-task vs. long-running agents. Long-running. Preserves context across related tasks.
  • Deployment topology. ergo in LXC on Proxmox, bridge + supervisor in docker-compose on desktop (128GB RAM).
  • Polling strategy. Two-layer: supervisor push for immediate dispatch, flat 30s polling as fallback/health check. Backoff deferred.
  • Git branch strategy. Supervisor-managed worktrees, branch-per-task, human merges, protected main.
  • Context monitoring. Supervisor accumulates usage.input_tokens per turn from SDK responses. No built-in SDK metric.
  • Durable state. robot.wtf project wiki via MCP. IRC is ephemeral conversation, wiki is the record. EM recovery reads wiki, not IRC history.
  • Mobile IRC client. Deferred. PM uses terminal for MVP. Future path: Claude.ai as PM delegate via remote MCP, bypassing the mobile client problem entirely.