Properties
category: design
tags: [architecture, infrastructure, refactoring]
last_updated: 2026-03-19
confidence: high
Server Consolidation: Merge auth_server + api_server
Merge the auth service (port 8003) and management/API service (port 8002) into a single Flask app on port 8002. Otterwiki (port 8000) and MCP sidecar (port 8001) remain separate.
Motivation
auth_server and api_server are tightly coupled:
- Same database (robot.db)
- Same models (UserModel, WikiModel)
- Same signing keys (RSA for JWT, EC for ATProto client)
- Same cookie (platform_token on .robot.wtf)
- Same user identity model (DID-based)
The separation creates concrete problems:
- E2E testing: Cookies set on port 8003 aren't sent to port 8002. Standing up two Flask servers in test fixtures requires cross-thread SQLite coordination. An implementation agent burned its entire context trying to solve this.
- Operational overhead: Two systemd services, two Gunicorn configs, two health checks for what is logically one "platform" service.
- Code duplication: Both apps call _load_keys(), get_connection(), and init_schema() independently. Both have their own rate limiting setup.
Current Architecture
Caddy (TLS, port 80/443)
├─ robot.wtf/auth/* → robot-auth (Gunicorn, port 8003, auth_server.py)
├─ robot.wtf/app/* → robot-api (Gunicorn, port 8002, api_server.py)
├─ robot.wtf/api/* → robot-api (port 8002)
├─ {slug}.robot.wtf/mcp → robot-mcp (uvicorn, port 8001)
├─ {slug}.robot.wtf/api/v1/* → robot-otterwiki (Gunicorn, port 8000, wsgi.py)
└─ {slug}.robot.wtf/* → robot-otterwiki (port 8000)
Four processes, three entry points (auth_server:application, api_server:application, wsgi:application), plus the MCP sidecar.
Target Architecture
Caddy (TLS, port 80/443)
├─ robot.wtf/auth/* → robot-platform (Gunicorn, port 8002, platform_server.py)
├─ robot.wtf/app/* → robot-platform (port 8002)
├─ robot.wtf/api/* → robot-platform (port 8002)
├─ {slug}.robot.wtf/mcp → robot-mcp (uvicorn, port 8001)
├─ {slug}.robot.wtf/api/v1/* → robot-otterwiki (Gunicorn, port 8000, wsgi.py)
└─ {slug}.robot.wtf/* → robot-otterwiki (port 8000)
Three processes, two entry points. Caddy routing is essentially unchanged: /auth/* and /app/* already route to the same IP on different ports, so pointing /auth/* at port 8002 is a one-line edit.
What Changes
New: app/platform_server.py
Single Flask app factory that combines auth and management routes:
    def create_app(*, db_path=None, client_jwk_path=None, signing_key_path=None):
        app = Flask(__name__, template_folder="templates")
        # templates/auth/: login.html, consent.html, error.html, base.html
        # templates/management/: layout.html, wiki_create.html, etc.

        # Shared setup: secret key, keys, DB, models, rate limiter
        ...

        # Auth routes (/auth/*)
        _register_auth_routes(app, platform_jwt, client_secret_jwk, ...)

        # Management UI routes (/app/*)
        _register_management_ui_routes(app, platform_jwt, wiki_model, user_model, ...)

        # Management API routes (/api/*)
        _register_management_api_routes(app, wiki_model, user_model, ...)

        # Well-known routes
        _register_wellknown_routes(app, ...)

        return app
The route registration functions extract the existing route definitions from auth_server.py and api_server.py into callable functions that take a Flask app and shared dependencies as arguments.
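The registration pattern can be sketched as follows. This is a minimal illustration, not the actual implementation: the keyword arguments (issuer, wiki_names) and the two routes (/auth/ping, /api/wikis) are hypothetical placeholders, since the real signatures and route bodies live in auth_server.py and api_server.py.

```python
from flask import Flask, jsonify


def _register_auth_routes(app, *, issuer):
    """Attach /auth/* routes to an existing app (hypothetical signature)."""

    @app.route("/auth/ping")  # placeholder route, not from the real code
    def auth_ping():
        return jsonify(service="auth", issuer=issuer)


def _register_management_api_routes(app, *, wiki_names):
    """Attach /api/* routes to the same app (hypothetical signature)."""

    @app.route("/api/wikis")  # placeholder route, not from the real code
    def list_wikis():
        return jsonify(wikis=wiki_names)


def create_app():
    app = Flask(__name__)
    # Shared dependencies are built once, then handed to each registrar.
    _register_auth_routes(app, issuer="example-issuer")
    _register_management_api_routes(app, wiki_names=["demo"])
    return app
```

Because each registrar receives its dependencies as arguments, tests can call create_app() once and exercise both route groups through a single test client.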
Removed
- app/auth_server.py: routes moved to platform_server.py
- ansible/roles/deploy/templates/robot-auth.service.j2: systemd service removed
- ansible/roles/deploy/templates/gunicorn-auth.conf.py.j2: Gunicorn config removed
Modified
- app/api_server.py → renamed/merged into platform_server.py
- ansible/roles/deploy/templates/Caddyfile.j2: remove auth port, route /auth/* to port 8002
- ansible/roles/deploy/tasks/main.yml: remove auth service deployment
- app/auth/templates/: move to app/templates/auth/
- app/management/templates/: move to app/templates/management/
Unchanged
- app/wsgi.py: otterwiki entry point, completely independent
- app/resolver.py: TenantResolver wraps otterwiki, not the platform service
- app/management/routes.py: ManagementMiddleware still wraps the platform Flask app
- All auth logic: unchanged, just relocated
- All management logic: unchanged
- Database schema: unchanged
- MCP sidecar: unchanged
ManagementMiddleware Handling
Currently, api_server.py wraps the Flask app with ManagementMiddleware (a WSGI middleware that intercepts /api/* for rate limiting and auth). Auth routes don't go through this middleware.
After consolidation, ManagementMiddleware still wraps the combined Flask app. It already passes through paths it doesn't handle — /auth/* routes will pass through to Flask unchanged. No middleware changes needed.
Verify by reading ManagementMiddleware's __call__ — it only intercepts paths matching its configured prefixes (/api/). All other paths pass to the wrapped app.
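The pass-through behavior being relied on looks roughly like the sketch below. PrefixMiddleware is a hypothetical stand-in, not the real ManagementMiddleware: the 401 check is an invented placeholder for whatever rate limiting and auth the real class performs on /api/* paths.

```python
class PrefixMiddleware:
    """Hypothetical WSGI middleware that only acts on configured path prefixes."""

    def __init__(self, wrapped_app, prefixes=("/api/",)):
        self.wrapped_app = wrapped_app
        self.prefixes = tuple(prefixes)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(self.prefixes):
            # Placeholder check standing in for the real auth / rate limiting.
            if "HTTP_AUTHORIZATION" not in environ:
                start_response("401 Unauthorized", [("Content-Type", "text/plain")])
                return [b"unauthorized"]
        # Anything else (for example /auth/*) goes straight to the wrapped app.
        return self.wrapped_app(environ, start_response)


def inner_app(environ, start_response):
    """Stand-in for the combined Flask app."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"flask:" + environ["PATH_INFO"].encode()]


def call(app, path):
    """Invoke a WSGI app with a minimal environ and return (status, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

If the real __call__ has this shape, a request to /auth/login reaches Flask untouched while /api/* requests are still checked first.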
Template Directory Structure
Before:
app/auth/templates/: base.html, login.html, consent.html, error.html
app/management/templates/: layout.html, wiki_create.html, wiki_settings.html, account.html
After:
app/templates/
├─ auth/: base.html, login.html, consent.html, error.html
└─ management/: layout.html, wiki_create.html, wiki_settings.html, account.html
Template references in route code change from render_template("login.html") to render_template("auth/login.html"). Mechanical find-and-replace.
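Since a missed call only fails at render time, a small check can flag render_template() calls that still lack a subdirectory prefix. The sketch below is one possible approach, assuming template names are passed as string literals; find_unprefixed_templates is a hypothetical helper, and calls that build the name dynamically would not be caught.

```python
import re

# Matches render_template("name") or render_template('name') and captures the name.
RENDER_CALL = re.compile(r"""render_template\(\s*["']([^"']+)["']""")

# Subdirectories introduced by the new template layout.
KNOWN_PREFIXES = ("auth/", "management/")


def find_unprefixed_templates(source):
    """Return template names in `source` that lack a known subdirectory prefix."""
    return [
        name
        for name in RENDER_CALL.findall(source)
        if not name.startswith(KNOWN_PREFIXES)
    ]
```

Running this over the contents of each route module and asserting an empty result would turn a missed rename into a unit-test failure.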
Database Connection Strategy
Both apps currently use get_connection(), which opens a new SQLite connection per call. The consolidated app holds one connection per request: opened on first use, stored on Flask's g object, and closed in a teardown_appcontext handler.
The auth_server pattern (_get_db() storing in g._database) is cleaner than api_server's approach (connection at startup). Adopt the per-request pattern throughout.
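A minimal sketch of the per-request pattern, using Flask's standard g object and teardown_appcontext hook. The _get_db name and g._database attribute follow the description above; the function body, the /health route, and the in-memory database are assumptions for illustration only.

```python
import sqlite3

from flask import Flask, g


def create_app(db_path=":memory:"):
    app = Flask(__name__)

    def _get_db():
        # Open lazily and cache on g so one request reuses one connection.
        if "_database" not in g:
            g._database = sqlite3.connect(db_path)
        return g._database

    @app.teardown_appcontext
    def _close_db(exc):
        # Runs when the app context ends, whether or not the request failed.
        db = g.pop("_database", None)
        if db is not None:
            db.close()

    @app.route("/health")  # placeholder route, not from the real code
    def health():
        row = _get_db().execute("SELECT 1").fetchone()
        return {"ok": row[0] == 1}

    return app
```

Because each connection is created and closed inside the thread that serves the request, no connection is shared across threads.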
Rate Limiting
- Auth routes: Flask-Limiter with per-route decorators (@limiter.limit("1/minute"))
- Management API routes: WSGIRateLimiter singleton in ManagementMiddleware
Both can coexist — Flask-Limiter operates at the Flask level, WSGIRateLimiter at the WSGI level. No conflict.
Session and Cookie
One Flask app = one secret_key = one session. The platform_token cookie is set with domain=COOKIE_DOMAIN, which is the same regardless of which routes set it. No changes needed.
E2E Testing Impact
The consolidation directly unblocks E2E testing:
- One server fixture instead of two
- Cookies work naturally (same origin)
- No SQLite cross-thread issues
- authenticated_page fixture just logs in and the cookie works for all routes
- The 11 planned E2E tests become straightforward
Implementation Sequence
- Create app/platform_server.py with combined app factory
- Move templates to app/templates/{auth,management}/
- Update render_template() calls with subdirectory prefixes
- Verify all existing unit tests pass against the new structure
- Update Ansible: remove auth service, update Caddy routes
- Deploy and verify
- Remove old auth_server.py and api_server.py
- Resume E2E test implementation with simplified fixtures
Risks
- Merge complexity: The two app factories have different initialization patterns. Reconciling them requires care but isn't architecturally novel.
- Template path changes: Every render_template() call needs updating. Mechanical but easy to miss one.
- Existing unit tests: Tests that import auth_server.create_app() or api_server._create_flask_app() need updating. There are many such tests, but the change is the same for each.
- Deployment window: The Ansible change removes one service and modifies Caddy. Brief downtime for auth routes during deploy. Mitigate by deploying the combined service first (both ports), then removing the old auth service.
