---
category: design
tags: [testing, playwright, e2e, infrastructure]
last_updated: 2026-03-20
confidence: high
---

# E2E Testing

End-to-end testing for robot.wtf using Playwright and a mock ATProto PDS.

**Status: COMPLETE.** 23 tests on `main`, deployed 2026-03-20.

## Current State

**23 passing E2E tests** on `main` across 4 files, plus 294 unit tests.

### Test files

**`test_login_flow.py`** (5 tests):

1. Login page loads
2. Client metadata endpoint serves valid ATProto OAuth metadata
3. Full OAuth login flow (mock PDS → PAR → consent → callback → cookie)
4. Logout clears cookie
5. OAuth callback error shows flash (not 500)

**`test_auth_flows.py`** (4 tests):

1. Auto-redirect when authenticated (visit `/auth/login` with valid cookie → `/app/`)
2. Return-to URL preservation across OAuth redirect chain
3. Login with handle (tests error handling for unresolvable mock handles)
4. Unauthenticated access redirects to login with `return_to`

**`test_wiki_lifecycle.py`** (8 tests):

1. Wiki creation form (slug/name → submit → redirect → MCP token visible)
2. Wiki settings update (change display_name → flash → persists on reload)
3. Wiki deletion with confirmation (expand danger zone → confirm slug → delete)
4. MCP token regeneration (click regen → JS confirm → new token in flash)
5. Dashboard redirects to existing wiki
6. Wiki deletion wrong slug rejected (confirm mismatch → flash error → wiki survives)
7. Wiki creation duplicate slug rejected
8. Wiki creation invalid slug rejected (bypasses browser validation, tests server-side)
9. Wiki settings steady state (no token flash, regenerate button present)

**`test_account.py`** (6 tests):

1. Account page renders (displays DID, handle)
2. Account deletion wrong confirmation (wrong handle → error flash)
3. MCP consent page renders (client info, wiki name, approve/deny buttons)
4. Account deletion (correct handle → cookie cleared → redirected)
5. MCP consent deny redirects with error
6. Wiki settings steady state page elements

### Infrastructure

- `tests/e2e/mock_pds.py`: In-process mock ATProto PDS with PKCE verification, thread-safe state
- `tests/e2e/conftest.py`: Fixtures for single platform server, authenticated pages, wiki creation

### Fixtures

- **`platform_server`** (session): Starts consolidated Flask app in daemon thread on a free port
- **`authenticated_page`** (function): Fresh browser context with valid `platform_token` cookie via direct JWT minting
- **`wiki_fixture`** (function): Creates wiki directly in DB + filesystem, cleans up after test
- **`destructive_page`** (function): Separate browser context for tests that destroy state
- **`pds`** (session): Mock PDS in daemon thread
- **`test_account`** (session): Test account on mock PDS

### Production code changes for test mode

Gated by `ALLOW_HTTP_PDS=true` + `FLASK_ENV=testing` (RuntimeError at import if either is wrong):

- `app/auth/atproto_security.py`: `_ALLOW_HTTP_PDS` flag, loopback SSRF relaxation
- `app/auth/atproto_identity.py`: `PLC_DIRECTORY_URL` read at request time, skip bidirectional handle verification
- `app/auth/atproto_oauth.py`: Relax HTTPS/port assertions on auth server metadata
- `app/platform_server.py`: `_SCHEME` variable, conditional `SESSION_COOKIE_SECURE`, conditional cookie `secure` flag, rate limiter disabled in test mode, limiter GC strong-reference fix
- `app/db.py`: `check_same_thread=False` scoped to `FLASK_ENV=testing`

### Bug fixes discovered during E2E work

- `resolve_did()` SSRF: Upgraded from plain `requests.get` to `hardened_http` (pre-existing vulnerability, elevated by injectable `PLC_DIRECTORY_URL`)
- Flask-Limiter GC: `Limiter` object garbage collected after `create_app()` returned due to weak references. Fixed with strong ref in `app.config["_LIMITER"]`

## Architecture Notes

### Mock PDS

The mock PDS (`tests/e2e/mock_pds.py`) implements the full ATProto OAuth flow:

- Account creation/session management
- OAuth AS metadata, protected resource metadata
- PAR, authorize (HTML form), token exchange with PKCE S256 verification
- DID document serving (acts as PLC directory)
- Thread-safe global state with `threading.Lock`

All on `127.0.0.1` to avoid IPv6 resolution issues.

### Test mode env vars

- `ALLOW_HTTP_PDS=true`: relaxes SSRF protections for loopback HTTP (guarded by `FLASK_ENV=testing`)
- `PLC_DIRECTORY_URL`: points at mock PDS for DID resolution (read at request time in `resolve_did()`)
- `PLATFORM_DOMAIN=127.0.0.1:{port}`: makes CLIENT_ID/REDIRECT_URI use HTTP
- `WIKI_TEMPLATE_DIR`: pointed at nonexistent path for predictable fallback behavior

## Future Directions (priority order)

### 1. Resolver permission tests (HIGH)

The `TenantResolver` is the only thing preventing cross-tenant access. No E2E test hits a wiki subdomain. The `is_bearer_token` bypass, `_apply_wiki_access_restrictions`, and the internal API key path are untested end-to-end. Requires routing to a second Host in the test environment (Playwright supports `set_extra_http_headers`).

### 2. Multi-user fixtures (HIGH)

Single test account means ownership isolation is untested. Add `test_account_b` (mock PDS already supports multiple accounts). Test: user B cannot access user A's wiki settings, user B gets appropriate access level on user A's wiki content.

### 3. Fix CI pipeline (HIGH, low effort)

Current `ci.yml` doesn't install Playwright browsers. Needs: `playwright install chromium`, separate unit/E2E jobs, browser caching (`~/.cache/ms-playwright`), `--screenshot=only-on-failure` artifacts, `--timeout=60`.

### 4. Infrastructure hardening (MEDIUM)

- Port allocation race: bind-then-close gap before `make_server`. Pass bound socket directly.
- Silent teardown: `wiki_fixture` swallows cleanup exceptions. Log them.
- Session-scoped `page` fixture leaks state between tests.

### 5. MCP consent + tool invocation E2E (MEDIUM)

The MCP server (`otterwiki-mcp/` repo, separate from `mcp_entry.py` sidecar) has 12 real tools wrapping the REST API. E2E testing the full flow (consent → token → tool invocation) is feasible now. The consent HMAC signing is security-critical.

### 6. Rate limit enforcement (LOW)

One test: 6 rapid writes, assert 6th returns 429. Catches wiring bugs where the limiter is instantiated but never called.

### 7. Otterwiki integration (DEFERRED)

Full path: login → create wiki → visit subdomain → see content. Requires otterwiki installed in CI and subprocess management. Defer until CI infrastructure is more mature.
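
As an illustration of the port allocation race listed under Infrastructure hardening (item 4), the gap disappears if the server binds port 0 itself and the fixture reads the assigned port back, instead of probing for a free port, closing the probe socket, and rebinding. The sketch below uses the standard library's `wsgiref.simple_server.make_server` as a stand-in, on the assumption that the project's `make_server` comes from a different library; `demo_app`, `QuietHandler`, and `start_server` are hypothetical names and not part of the existing fixtures.

```python
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server


def demo_app(environ, start_response):
    """Hypothetical stand-in for the platform's WSGI app."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep test output clean


def start_server(app, host="127.0.0.1"):
    """Bind to port 0 so the OS assigns a free port to the listening socket.

    The socket that receives the port is the one that serves requests, so
    there is no window in which another process can claim the port.
    """
    server = make_server(host, 0, app, handler_class=QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


server, thread = start_server(demo_app)
port = server.server_port  # assigned by the OS at bind time

with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
    status, body = resp.status, resp.read()

server.shutdown()      # stop serve_forever()
server.server_close()  # release the listening socket
```

If the real server factory accepts an already-listening socket or file descriptor, handing that over matches the "pass bound socket directly" fix more literally; whether it does should be checked against the installed version.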
