Commit 82f820

2026-03-19 23:18:55 Claude (MCP): [mcp] Revamp agent supervisor section with concrete ClaudeSDKClient design
Minsky/Agent_IRC_Architecture.md ..
@@ 270,15 270,68 @@
### 3. Agent supervisor
- - Python process using the Claude Code SDK.
- - Spawns and manages agent sessions (EM, managers, workers).
- - Watches IRC for `TASK:` prefixes to push-dispatch agents.
- - Monitors context usage (accumulated input tokens per session).
- - Handles shift-changes (summarize to wiki → kill → respawn with wiki context).
- - Runs the Haiku idle-checker.
- - Creates and cleans up git worktrees.
- - Runs in a Docker container on the desktop, alongside the bridge.
- - Bind-mounts the project directory from the host.
+ A Python asyncio process that manages the lifecycle of all `ClaudeSDKClient` agent sessions. It is the only component that directly constructs or destroys agent sessions.
+
+ #### Agent session wrapper
+
+ ```python
+ @dataclass
+ class AgentSession:
+     nick: str                      # IRC nick / human name
+     role: str                      # "em" | "manager" | "worker"
+     task_id: str | None            # None for the EM
+     channels: list[str]            # IRC channels this agent monitors
+     client: ClaudeSDKClient        # open SDK session
+     session_id: str | None         # from last ResultMessage; persisted for resume
+     input_tokens_total: int        # accumulated across all turns
+     cost_usd_total: float          # accumulated cost
+     last_active: datetime          # updated on each AssistantMessage
+     worktree_path: Path | None     # git worktree for this agent's task
+ ```
+
+ #### Session construction
+
+ ```python
+ async def spawn_agent(role, task_id, channels, handoff_wiki_path=None):
+     nick = names_pool.pop()
+     options = ClaudeAgentOptions(
+         append_system_prompt=build_system_prompt(
+             nick=nick, role=role, channels=channels,
+             handoff_wiki_path=handoff_wiki_path,
+         ),
+         allowed_tools=["Read", "Edit", "Bash", "Glob", "Grep",
+                        "mcp__irc_bridge__*", "mcp__dev_wiki__*"],
+         cwd=worktree_path_for(task_id),
+         mcp_servers=MCP_SERVER_CONFIGS,
+     )
+     client = ClaudeSDKClient(options=options)
+     await client.__aenter__()
+     return AgentSession(nick=nick, role=role, ...)
+ ```
+
+ `build_system_prompt()` returns a string that is injected via `append_system_prompt`, so it is appended to Claude Code's default system prompt rather than replacing it. The content includes the agent name, role, channel assignments, and (when resuming after a shift-change) a pointer to the wiki handoff page.
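
As an illustration of the content listed above, a minimal sketch of `build_system_prompt()` might look like the following. The prompt wording, the `str` type for `handoff_wiki_path`, and the example nick and path are assumptions made for this sketch; the source only specifies which pieces of information the string carries.

```python
from __future__ import annotations


def build_system_prompt(nick: str, role: str, channels: list[str],
                        handoff_wiki_path: str | None = None) -> str:
    """Assemble text to append to the default system prompt (hypothetical wording)."""
    lines = [
        f"Your name is {nick} and your role is {role}.",
        f"Monitor these IRC channels: {', '.join(channels)}.",
    ]
    if handoff_wiki_path is not None:
        # Only present when the agent replaces a predecessor after a shift-change.
        lines.append(
            "You are taking over from a previous session. "
            f"Read the handoff page at {handoff_wiki_path} before acting."
        )
    return "\n".join(lines)


# Example values are made up for illustration.
fresh_prompt = build_system_prompt("ada", "worker", ["#task-42"])
handoff_prompt = build_system_prompt(
    "bob", "worker", ["#task-42"], handoff_wiki_path="handoffs/task-42.md"
)
```

Keeping the handoff pointer as an optional final line means a fresh spawn and a shift-change respawn share the same builder and differ only in one argument.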
+
+ #### Supervisor restart recovery
+
+ On startup, the supervisor reads a persisted state file containing `{nick, role, task_id, channels, session_id, tokens, cost, worktree_path}` for each live agent. For each entry:
+
+ - If the session is still resumable (`session_id` is valid), reconnect with `resume=session_id` and send a check-in prompt.
+ - If resume fails, treat it as a shift-change: spawn a replacement and have it read the last handoff wiki page.
+
+ This means a supervisor crash does not force context loss — agent sessions can be reconnected.
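
The persistence side of this recovery could be sketched as below, assuming a JSON state file. The file format, the `PersistedAgent` class, and the helper names `save_state`, `load_state`, and `recovery_plan` are hypothetical; only the field list comes from the description above. The plan covers only the case where no `session_id` was recorded; a resume attempt that fails at runtime would still need to fall back to the shift-change path.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class PersistedAgent:
    nick: str
    role: str
    task_id: str | None
    channels: list[str]
    session_id: str | None
    tokens: int
    cost: float
    worktree_path: str | None


def save_state(path: Path, agents: list[PersistedAgent]) -> None:
    # Write to a temp file first so a crash mid-write cannot corrupt the state file.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps([asdict(a) for a in agents], indent=2))
    tmp.replace(path)


def load_state(path: Path) -> list[PersistedAgent]:
    if not path.exists():
        return []  # first start: nothing to recover
    return [PersistedAgent(**entry) for entry in json.loads(path.read_text())]


def recovery_plan(agents: list[PersistedAgent]) -> dict[str, str]:
    # "resume" means: try resume=session_id first; "respawn" means: go straight
    # to the shift-change path because no session id was ever recorded.
    return {a.nick: ("resume" if a.session_id else "respawn") for a in agents}


# Round-trip example with made-up agents.
with tempfile.TemporaryDirectory() as d:
    state_file = Path(d) / "supervisor_state.json"
    save_state(state_file, [
        PersistedAgent("ada", "worker", "42", ["#task-42"], "sess-1", 1200, 0.5, None),
        PersistedAgent("bob", "manager", "7", ["#mgmt"], None, 0, 0.0, None),
    ])
    restored = load_state(state_file)
plan = recovery_plan(restored)
```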
+
+ #### Responsibilities summary
+
+ | Responsibility | Mechanism |
+ |---|---|
+ | Spawn/retire agents | `ClaudeSDKClient` async context manager |
+ | Dispatch on `TASK:` | Direct IRC read → `client.query()` |
+ | Heartbeat / idle check | 30s loop → Haiku idle classifier → `client.query()` |
+ | Context monitoring | Accumulate `ResultMessage.usage.input_tokens` |
+ | Shift-change | Handoff prompt → close → `spawn_agent()` with wiki path |
+ | Supervisor restart recovery | Persisted state file + `session_id` resume |
+ | Git worktrees | `git worktree add/remove` before/after agent spawn |
+ | Budget tracking | Accumulate `ResultMessage.total_cost_usd` per session |
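
The context-monitoring and budget-tracking rows both reduce to per-session accumulation. A minimal sketch, independent of the SDK, is shown below. The `UsageTracker` class and the 150,000-token threshold are assumptions for illustration; the source does not state the actual shift-change trigger.

```python
from dataclasses import dataclass

# Hypothetical threshold; the real value is not given in the source.
SHIFT_CHANGE_THRESHOLD = 150_000


@dataclass
class UsageTracker:
    input_tokens_total: int = 0
    cost_usd_total: float = 0.0

    def record_turn(self, input_tokens: int, cost_usd: float) -> None:
        # Called once per completed turn with that turn's reported usage.
        self.input_tokens_total += input_tokens
        self.cost_usd_total += cost_usd

    def needs_shift_change(self, threshold: int = SHIFT_CHANGE_THRESHOLD) -> bool:
        # True once accumulated input tokens reach the threshold.
        return self.input_tokens_total >= threshold


tracker = UsageTracker()
for tokens, cost in [(60_000, 0.25), (70_000, 0.5), (40_000, 0.25)]:
    tracker.record_turn(tokens, cost)
```

In this sketch the supervisor would check `needs_shift_change()` after each turn and, when it returns true, start the handoff sequence from the shift-change row.
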
```
Desktop (128GB RAM) Proxmox (16GB RAM)