---
category: design
tags: [testing, playwright, e2e, infrastructure]
last_updated: 2026-03-20
confidence: high
---

# E2E Testing

End-to-end testing for robot.wtf using Playwright and a mock ATProto PDS.

**Status: COMPLETE** (23 tests on `main`, deployed 2026-03-20).

## Current State

**23 passing E2E tests** on `main` across 4 files, plus 294 unit tests.

### Test files

**`test_login_flow.py`** (5 tests):
1. Login page loads
2. Client metadata endpoint serves valid ATProto OAuth metadata
3. Full OAuth login flow (mock PDS → PAR → consent → callback → cookie)
4. Logout clears cookie
5. OAuth callback error shows flash (not 500)

**`test_auth_flows.py`** (4 tests):
1. Auto-redirect when authenticated (visit `/auth/login` with valid cookie → `/app/`)
2. Return-to URL preservation across OAuth redirect chain
3. Login with handle (tests error handling for unresolvable mock handles)
4. Unauthenticated access redirects to login with `return_to`

**`test_wiki_lifecycle.py`** (8 tests):
1. Wiki creation form (slug/name → submit → redirect → MCP token visible)
2. Wiki settings update (change `display_name` → flash → persists on reload)
3. Wiki deletion with confirmation (expand danger zone → confirm slug → delete)
4. MCP token regeneration (click regen → JS confirm → new token in flash)
5. Dashboard redirects to existing wiki
6. Wiki deletion wrong slug rejected (confirm mismatch → flash error → wiki survives)
7. Wiki creation duplicate slug rejected
8. Wiki creation invalid slug rejected (bypasses browser validation, tests server-side)
9. Wiki settings steady state (no token flash, regenerate button present)

**`test_account.py`** (6 tests):
1. Account page renders (displays DID, handle)
2. Account deletion wrong confirmation (wrong handle → error flash)
3. MCP consent page renders (client info, wiki name, approve/deny buttons)
4. Account deletion (correct handle → cookie cleared → redirected)
5. MCP consent deny redirects with error
6. Wiki settings steady state page elements

### Infrastructure

- `tests/e2e/mock_pds.py`: in-process mock ATProto PDS with PKCE verification and thread-safe state
- `tests/e2e/conftest.py`: fixtures for the single platform server, authenticated pages, and wiki creation

### Fixtures

- **`platform_server`** (session): Starts the consolidated Flask app in a daemon thread on a free port
- **`authenticated_page`** (function): Fresh browser context with a valid `platform_token` cookie via direct JWT minting
- **`wiki_fixture`** (function): Creates a wiki directly in DB + filesystem, cleans up after the test
- **`destructive_page`** (function): Separate browser context for tests that destroy state
- **`pds`** (session): Mock PDS in a daemon thread
- **`test_account`** (session): Test account on the mock PDS

### Production code changes for test mode

Gated by `ALLOW_HTTP_PDS=true` together with `FLASK_ENV=testing`; a `RuntimeError` is raised at import if either is wrong:

- `app/auth/atproto_security.py`: `_ALLOW_HTTP_PDS` flag, loopback SSRF relaxation
- `app/auth/atproto_identity.py`: `PLC_DIRECTORY_URL` read at request time, bidirectional handle verification skipped
- `app/auth/atproto_oauth.py`: HTTPS/port assertions on auth server metadata relaxed
- `app/platform_server.py`: `_SCHEME` variable, conditional `SESSION_COOKIE_SECURE`, conditional cookie `secure` flag, rate limiter disabled in test mode, limiter GC strong-reference fix
- `app/db.py`: `check_same_thread=False` scoped to `FLASK_ENV=testing`

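
The test-mode gate could be enforced with an import-time check along the following lines. This is a minimal sketch, not the code in `app/auth/atproto_security.py`: the function name `http_pds_allowed` is hypothetical, and the exact rule is an assumption (here the flag is honored only when `FLASK_ENV=testing`, and setting it in any other environment raises).

```python
import os


def http_pds_allowed(environ=None):
    """Return True only when the test-mode gate is fully satisfied.

    Assumed rule: ALLOW_HTTP_PDS=true outside FLASK_ENV=testing is treated
    as a misconfiguration and raises instead of being silently ignored.
    """
    env = os.environ if environ is None else environ
    allow = env.get("ALLOW_HTTP_PDS", "").lower() == "true"
    if allow and env.get("FLASK_ENV") != "testing":
        raise RuntimeError("ALLOW_HTTP_PDS=true requires FLASK_ENV=testing")
    return allow


# A module would evaluate the gate once at import time, for example:
# _ALLOW_HTTP_PDS = http_pds_allowed()
```

Failing loudly at import keeps a stray `ALLOW_HTTP_PDS=true` from ever relaxing SSRF checks in a non-test deployment.
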
### Bug fixes discovered during E2E work

- `resolve_did()` SSRF: Upgraded from plain `requests.get` to `hardened_http` (pre-existing vulnerability, elevated by the injectable `PLC_DIRECTORY_URL`)
- Flask-Limiter GC: The `Limiter` object was garbage collected after `create_app()` returned because it was only weakly referenced. Fixed by holding a strong reference in `app.config["_LIMITER"]`

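
The garbage-collection failure mode can be reproduced with the standard library alone. The sketch below is a generic illustration, not Flask-Limiter's internals: `FakeLimiter`, `create_app_unpinned`, and `create_app_pinned` are hypothetical stand-ins, and a plain dict plays the role of `app.config`.

```python
import gc
import weakref


class FakeLimiter:
    """Stand-in for an object that the framework references only weakly."""


def create_app_unpinned():
    limiter = FakeLimiter()
    hooks = {"limiter": weakref.ref(limiter)}  # weak reference only
    return {"hooks": hooks}  # the local strong reference dies on return


def create_app_pinned():
    limiter = FakeLimiter()
    hooks = {"limiter": weakref.ref(limiter)}
    # A strong reference stored on the app config keeps the object alive.
    return {"hooks": hooks, "_LIMITER": limiter}


unpinned = create_app_unpinned()
pinned = create_app_pinned()
gc.collect()  # make collection explicit on non-refcounting interpreters

unpinned_alive = unpinned["hooks"]["limiter"]() is not None
pinned_alive = pinned["hooks"]["limiter"]() is not None
```

In the unpinned variant the weak reference resolves to `None` once the factory returns, which matches the symptom of a limiter that exists at setup but is gone at request time.
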
## Architecture Notes

### Mock PDS
The mock PDS (`tests/e2e/mock_pds.py`) implements the full ATProto OAuth flow:
- Account creation/session management
- OAuth AS metadata, protected resource metadata
- PAR, authorize (HTML form), token exchange with PKCE S256 verification
- DID document serving (acts as PLC directory)
- Thread-safe global state with `threading.Lock`

Everything binds to `127.0.0.1` to avoid IPv6 resolution issues.

### Test mode env vars
- `ALLOW_HTTP_PDS=true`: relaxes SSRF protections for loopback HTTP (guarded by `FLASK_ENV=testing`)
- `PLC_DIRECTORY_URL`: points at the mock PDS for DID resolution (read at request time in `resolve_did()`)
- `PLATFORM_DOMAIN=127.0.0.1:{port}`: makes `CLIENT_ID`/`REDIRECT_URI` use HTTP
- `WIKI_TEMPLATE_DIR`: pointed at a nonexistent path for predictable fallback behavior

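
Taken together, the variables above could be assembled by a small helper before the platform server starts. This is a sketch only: `e2e_environment` is a hypothetical name, and the template directory path is an arbitrary placeholder chosen because it should not exist.

```python
def e2e_environment(platform_port: int, pds_url: str) -> dict:
    """Build the environment for one E2E session (illustrative values)."""
    return {
        "FLASK_ENV": "testing",
        "ALLOW_HTTP_PDS": "true",
        # The mock PDS doubles as the PLC directory for DID resolution.
        "PLC_DIRECTORY_URL": pds_url,
        # A loopback host:port makes the OAuth client URLs plain HTTP.
        "PLATFORM_DOMAIN": f"127.0.0.1:{platform_port}",
        # Placeholder path, assumed not to exist, to force the fallback.
        "WIKI_TEMPLATE_DIR": "/nonexistent/wiki-templates",
    }
```

A fixture would typically apply this with `os.environ.update(...)` before the application modules are imported, since the test-mode gate is checked at import time.
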
## Future Directions (priority order)

### 1. Resolver permission tests (HIGH)
The `TenantResolver` is the only thing preventing cross-tenant access. No E2E test hits a wiki subdomain. The `is_bearer_token` bypass, `_apply_wiki_access_restrictions`, and the internal API key path are untested end-to-end. Requires routing to a second Host in the test environment (Playwright supports `set_extra_http_headers`).

### 2. Multi-user fixtures (HIGH)
Single test account means ownership isolation is untested. Add `test_account_b` (mock PDS already supports multiple accounts). Test: user B cannot access user A's wiki settings, user B gets appropriate access level on user A's wiki content.

### 3. Fix CI pipeline (HIGH, low effort)
Current `ci.yml` doesn't install Playwright browsers. Needs: `playwright install chromium`, separate unit/E2E jobs, browser caching (`~/.cache/ms-playwright`), `--screenshot=only-on-failure` artifacts, `--timeout=60`.

### 4. Infrastructure hardening (MEDIUM)
- Port allocation race: bind-then-close gap before `make_server`. Pass bound socket directly.
- Silent teardown: `wiki_fixture` swallows cleanup exceptions. Log them.
- Session-scoped `page` fixture leaks state between tests.

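
One way to close the port allocation race is to never release the port between choosing it and serving on it. The sketch below shows a related variant of the fix proposed in the list: rather than handing over a pre-bound socket, it lets the server bind port 0 itself and reads the assigned port back. It uses the standard library's `wsgiref.simple_server.make_server` as a stand-in for the project's server, and `hello_app`, `QuietHandler`, and `start_server` are hypothetical names.

```python
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep test output clean


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def start_server(app):
    # Port 0 asks the OS for a free port; the server keeps the socket it
    # bound, so there is no window in which another process can take it.
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, server.server_port


server, port = start_server(hello_app)
try:
    # Bypass any proxy configured in the environment for the loopback call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://127.0.0.1:{port}/", timeout=5) as resp:
        body = resp.read()
finally:
    server.shutdown()
    server.server_close()
```

The port is only known after the server exists, so anything derived from it (such as `PLATFORM_DOMAIN`) has to be set after this step.
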
### 5. MCP consent + tool invocation E2E (MEDIUM)
The MCP server (`otterwiki-mcp/` repo, separate from `mcp_entry.py` sidecar) has 12 real tools wrapping the REST API. E2E testing the full flow (consent → token → tool invocation) is feasible now. The consent HMAC signing is security-critical.

### 6. Rate limit enforcement (LOW)
One test: 6 rapid writes, assert 6th returns 429. Catches wiring bugs where the limiter is instantiated but never called.

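
The shape of that single test can be sketched independently of the HTTP layer. Everything here is illustrative: `burst_statuses` and `make_fake_endpoint` are hypothetical helpers, and the limit of five writes is an assumption inferred from the expectation that the sixth request is rejected. In a real test, `send_write` would issue the write through the browser or an HTTP client and return the response status.

```python
def burst_statuses(send_write, count=6):
    """Call send_write() count times back to back and collect status codes."""
    return [send_write() for _ in range(count)]


def make_fake_endpoint(limit=5):
    """Hypothetical stand-in for a rate-limited write endpoint."""
    seen = {"count": 0}

    def send_write():
        seen["count"] += 1
        return 200 if seen["count"] <= limit else 429

    return send_write


statuses = burst_statuses(make_fake_endpoint(limit=5))
assert statuses[-1] == 429, f"limiter not enforced: {statuses}"
```

If the limiter were instantiated but never attached to the route, every request would succeed and the final assertion would fail, which is the wiring bug this test is meant to catch.
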
### 7. Otterwiki integration (DEFERRED)
Full path: login → create wiki → visit subdomain → see content. Requires otterwiki installed in CI and subprocess management. Defer until CI infrastructure is more mature.
