Properties
category: design
tags: [testing, playwright, e2e, infrastructure]
last_updated: 2026-03-20
confidence: high
E2E Testing
End-to-end testing for robot.wtf using Playwright and a mock ATProto PDS.
Status: COMPLETE — 23 tests on main, deployed 2026-03-20.
Current State
23 passing E2E tests on main across 4 files, plus 294 unit tests.
Test files
test_login_flow.py (5 tests):
- Login page loads
- Client metadata endpoint serves valid ATProto OAuth metadata
- Full OAuth login flow (mock PDS → PAR → consent → callback → cookie)
- Logout clears cookie
- OAuth callback error shows flash (not 500)
test_auth_flows.py (4 tests):
- Auto-redirect when authenticated (visit /auth/login with valid cookie → /app/)
- Return-to URL preservation across OAuth redirect chain
- Login with handle (tests error handling for unresolvable mock handles)
- Unauthenticated access redirects to login with return_to
test_wiki_lifecycle.py (8 tests):
- Wiki creation form (slug/name → submit → redirect → MCP token visible)
- Wiki settings update (change display_name → flash → persists on reload)
- Wiki deletion with confirmation (expand danger zone → confirm slug → delete)
- MCP token regeneration (click regen → JS confirm → new token in flash)
- Dashboard redirects to existing wiki
- Wiki deletion wrong slug rejected (confirm mismatch → flash error → wiki survives)
- Wiki creation duplicate slug rejected
- Wiki creation invalid slug rejected (bypasses browser validation, tests server-side)
- Wiki settings steady state (no token flash, regenerate button present)
test_account.py (6 tests):
- Account page renders (displays DID, handle)
- Account deletion wrong confirmation (wrong handle → error flash)
- MCP consent page renders (client info, wiki name, approve/deny buttons)
- Account deletion (correct handle → cookie cleared → redirected)
- MCP consent deny redirects with error
- Wiki settings steady state page elements
Infrastructure
- tests/e2e/mock_pds.py: In-process mock ATProto PDS with PKCE verification, thread-safe state
- tests/e2e/conftest.py: Fixtures for single platform server, authenticated pages, wiki creation
Fixtures
- platform_server (session): Starts consolidated Flask app in daemon thread on a free port
- authenticated_page (function): Fresh browser context with valid platform_token cookie via direct JWT minting
- wiki_fixture (function): Creates wiki directly in DB + filesystem, cleans up after test
- destructive_page (function): Separate browser context for tests that destroy state
- pds (session): Mock PDS in daemon thread
- test_account (session): Test account on mock PDS
Production code changes for test mode
Gated by ALLOW_HTTP_PDS=true + FLASK_ENV=testing (RuntimeError at import if either is wrong):
- app/auth/atproto_security.py: _ALLOW_HTTP_PDS flag, loopback SSRF relaxation
- app/auth/atproto_identity.py: PLC_DIRECTORY_URL read at request time, skip bidirectional handle verification
- app/auth/atproto_oauth.py: Relax HTTPS/port assertions on auth server metadata
- app/platform_server.py: _SCHEME variable, conditional SESSION_COOKIE_SECURE, conditional cookie secure flag, rate limiter disabled in test mode, limiter GC strong-reference fix
- app/db.py: check_same_thread=False scoped to FLASK_ENV=testing
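The gate can be sketched as a small check evaluated at import time. This is one reading of "RuntimeError at import if either is wrong": the relaxation is refused unless FLASK_ENV is also testing. The function name http_pds_allowed and the exact string comparisons are assumptions, not the code in atproto_security.py.

```python
import os
from typing import Mapping


def http_pds_allowed(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the test-mode relaxation is explicitly and safely enabled.

    Raises RuntimeError if ALLOW_HTTP_PDS=true is set outside FLASK_ENV=testing,
    so a misconfigured production deploy fails loudly instead of relaxing SSRF checks.
    """
    allow = env.get("ALLOW_HTTP_PDS", "") == "true"
    testing = env.get("FLASK_ENV", "") == "testing"
    if allow and not testing:
        raise RuntimeError("ALLOW_HTTP_PDS=true requires FLASK_ENV=testing")
    return allow
```

Calling this once at module level (for example, _ALLOW_HTTP_PDS = http_pds_allowed()) gives the fail-at-import behavior described above.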
Bug fixes discovered during E2E work
- resolve_did() SSRF: Upgraded from plain requests.get to hardened_http (pre-existing vulnerability, elevated by injectable PLC_DIRECTORY_URL)
- Flask-Limiter GC: Limiter object garbage collected after create_app() returned due to weak references. Fixed with strong ref in app.config["_LIMITER"]
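The general failure mode behind the limiter bug can be reproduced with stand-in classes. FakeLimiter, FakeApp, and weak_hooks are hypothetical; the source does not describe how Flask-Limiter holds its references, so this only illustrates why an object that is referenced weakly disappears once the factory function returns, and why parking it in app.config keeps it alive.

```python
import gc
import weakref


class FakeLimiter:
    """Hypothetical stand-in for an extension object; not Flask-Limiter itself."""


class FakeApp:
    """Hypothetical stand-in for a Flask app that tracks extensions weakly."""

    def __init__(self):
        self.config = {}
        self.weak_hooks = weakref.WeakSet()


def create_app(keep_strong_ref: bool) -> FakeApp:
    app = FakeApp()
    limiter = FakeLimiter()
    app.weak_hooks.add(limiter)  # weak reference only: does not keep limiter alive
    if keep_strong_ref:
        # Strong reference, mirroring the app.config["_LIMITER"] fix.
        app.config["_LIMITER"] = limiter
    return app  # the local name `limiter` goes out of scope here
```

Without the strong reference, the weak set is empty after create_app() returns; with it, the limiter survives for the lifetime of the app.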
Architecture Notes
Mock PDS
The mock PDS (tests/e2e/mock_pds.py) implements the full ATProto OAuth flow:
- Account creation/session management
- OAuth AS metadata, protected resource metadata
- PAR, authorize (HTML form), token exchange with PKCE S256 verification
- DID document serving (acts as PLC directory)
- Thread-safe global state with threading.Lock
All on 127.0.0.1 to avoid IPv6 resolution issues.
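The PKCE S256 check in the token exchange step reduces to a hash comparison defined by RFC 7636. A minimal sketch, with function names chosen for illustration rather than taken from mock_pds.py:

```python
import base64
import hashlib
import hmac


def s256_challenge(code_verifier: str) -> str:
    """Compute BASE64URL(SHA256(code_verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(code_verifier: str, stored_challenge: str) -> bool:
    """Compare the recomputed challenge with the one stored at PAR/authorize time."""
    return hmac.compare_digest(s256_challenge(code_verifier), stored_challenge)
```

The server stores the code_challenge received in the pushed authorization request, then rejects the token request if the code_verifier sent later does not hash to the same value.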
Test mode env vars
- ALLOW_HTTP_PDS=true: relaxes SSRF protections for loopback HTTP (guarded by FLASK_ENV=testing)
- PLC_DIRECTORY_URL: points at mock PDS for DID resolution (read at request time in resolve_did())
- PLATFORM_DOMAIN=127.0.0.1:{port}: makes CLIENT_ID/REDIRECT_URI use HTTP
- WIKI_TEMPLATE_DIR: pointed at nonexistent path for predictable fallback behavior
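Taken together, the variables above could be assembled by a helper like the following. The function name build_test_env, the exact PLC_DIRECTORY_URL value (the mock PDS base URL), and the placeholder template path are assumptions; only the variable names and their roles come from this page.

```python
def build_test_env(platform_port: int, pds_port: int) -> dict:
    """Assemble the environment for one E2E session (illustrative values)."""
    return {
        "FLASK_ENV": "testing",        # required for the ALLOW_HTTP_PDS guard to pass
        "ALLOW_HTTP_PDS": "true",      # relax SSRF checks for loopback HTTP
        "PLC_DIRECTORY_URL": f"http://127.0.0.1:{pds_port}",  # mock PDS doubles as PLC
        "PLATFORM_DOMAIN": f"127.0.0.1:{platform_port}",      # yields HTTP client_id / redirect_uri
        "WIKI_TEMPLATE_DIR": "/nonexistent/wiki-templates",   # hypothetical path; forces fallback
    }
```

Because the app modules check these values at import, the environment has to be populated (for example with os.environ.update) before they are first imported.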
Future Directions (priority order)
1. Resolver permission tests (HIGH)
The TenantResolver is the only thing preventing cross-tenant access. No E2E test hits a wiki subdomain. The is_bearer_token bypass, _apply_wiki_access_restrictions, and the internal API key path are untested end-to-end. Requires routing to a second Host in the test environment (Playwright supports set_extra_http_headers).
2. Multi-user fixtures (HIGH)
Single test account means ownership isolation is untested. Add test_account_b (mock PDS already supports multiple accounts). Test: user B cannot access user A's wiki settings, user B gets appropriate access level on user A's wiki content.
3. Fix CI pipeline (HIGH, low effort)
Current ci.yml doesn't install Playwright browsers. Needs: playwright install chromium, separate unit/E2E jobs, browser caching (~/.cache/ms-playwright), --screenshot=only-on-failure artifacts, --timeout=60.
4. Infrastructure hardening (MEDIUM)
- Port allocation race: bind-then-close gap before make_server. Pass bound socket directly.
- Silent teardown: wiki_fixture swallows cleanup exceptions. Log them.
- Session-scoped page fixture leaks state between tests.
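The port allocation fix can be sketched with the standard library: bind a socket to port 0, keep it open, and hand that same socket to the server so no other process can claim the port in between. This uses http.server as a stand-in for the Flask/Werkzeug server; the helper names are hypothetical. Werkzeug's make_server also accepts an fd argument for a pre-bound socket, though whether that suits this fixture is not confirmed here.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class OkHandler(BaseHTTPRequestHandler):
    """Trivial handler standing in for the real application."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def bind_loopback() -> socket.socket:
    """Bind and listen on an OS-chosen port; the socket is never closed and reopened."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(16)
    return sock


def server_from_socket(sock: socket.socket, handler_cls) -> ThreadingHTTPServer:
    """Build a server around an already-bound, already-listening socket."""
    host, port = sock.getsockname()
    server = ThreadingHTTPServer((host, port), handler_cls, bind_and_activate=False)
    server.socket.close()  # discard the unbound socket the constructor created
    server.socket = sock
    server.server_address = (host, port)
    server.server_name, server.server_port = host, port
    return server


def start_in_thread(server: ThreadingHTTPServer) -> threading.Thread:
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

The port is known as soon as bind_loopback() returns, so the fixture can compute PLATFORM_DOMAIN from sock.getsockname() without ever releasing the port.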
5. MCP consent + tool invocation E2E (MEDIUM)
The MCP server (otterwiki-mcp/ repo, separate from mcp_entry.py sidecar) has 12 real tools wrapping the REST API. E2E testing the full flow — consent → token → tool invocation — is feasible now. The consent HMAC signing is security-critical.
6. Rate limit enforcement (LOW)
One test: 6 rapid writes, assert 6th returns 429. Catches wiring bugs where the limiter is instantiated but never called.
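The shape of that test can be sketched independently of the HTTP client. Here send_write is a hypothetical callable that performs one write request and returns its status code; in the real suite it would wrap a Playwright or HTTP call against a rate-limited endpoint.

```python
from typing import Callable, List


def fire_writes(send_write: Callable[[], int], attempts: int = 6) -> List[int]:
    """Issue `attempts` back-to-back writes and collect the status codes."""
    return [send_write() for _ in range(attempts)]


def assert_last_write_limited(statuses: List[int]) -> None:
    """Fail unless the final write was rejected with 429 Too Many Requests."""
    assert statuses, "no requests were sent"
    assert statuses[-1] == 429, f"expected 429 on final write, got {statuses}"
```

If the limiter is instantiated but never attached to the route, every status comes back as a success code and the assertion fails, which is the wiring bug this test targets.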
7. Otterwiki integration (DEFERRED)
Full path: login → create wiki → visit subdomain → see content. Requires otterwiki installed in CI and subprocess management. Defer until CI infrastructure is more mature.
