Properties
category: design
tags: [testing, playwright, e2e, infrastructure]
last_updated: 2026-03-19
confidence: high

E2E Testing

End-to-end testing for robot.wtf using Playwright and a mock ATProto PDS.

Current State

Branch: feat/e2e-tests on robot.wtf repo (4 commits)

4 passing tests in tests/e2e/test_login_flow.py:

  1. Login page loads
  2. Client metadata endpoint serves valid ATProto OAuth metadata
  3. Full OAuth login flow (mock PDS → PAR → consent → callback → cookie)
  4. Logout clears cookie

Infrastructure built:

  • tests/e2e/mock_pds.py — In-process mock ATProto PDS (OAuth endpoints, account creation, DID docs)
  • tests/e2e/conftest.py — Fixtures for PDS (mock or Docker), test account, key generation, auth server in background thread, Playwright browser context
  • tests/e2e/docker-compose.yml — Real PDS for CI environments with Docker
  • .github/workflows/e2e.yml — CI workflow
  • Production code changes gated by the ALLOW_HTTP_PDS env var (honoured only when FLASK_ENV=testing; setting it in any other environment raises RuntimeError)
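
A minimal sketch of how such a gate might look, assuming the flag is read as the string "true" and checked on demand. The function name http_pds_allowed and the optional environ parameter are hypothetical, not the names used in the production code.

```python
import os


def http_pds_allowed(environ=None):
    """Return True only when loopback HTTP PDS access is enabled for tests.

    Hypothetical helper: the real gate may live elsewhere and be named differently.
    """
    env = os.environ if environ is None else environ
    if env.get("ALLOW_HTTP_PDS", "").lower() != "true":
        return False
    # The flag is only honoured under FLASK_ENV=testing. Anywhere else it is a
    # misconfiguration, so fail loudly instead of quietly relaxing SSRF checks.
    if env.get("FLASK_ENV") != "testing":
        raise RuntimeError("ALLOW_HTTP_PDS requires FLASK_ENV=testing")
    return True
```

Raising instead of silently ignoring the flag means a stray ALLOW_HTTP_PDS=true in a production environment is caught the first time the gate is evaluated.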

Partially written test files (on branch, need fixtures):

  • tests/e2e/test_wiki_lifecycle.py — 4 tests written by Agent B
  • tests/e2e/test_account.py — 4 tests written by Agent C (MCP consent test marked skip)

Blocked On

Server consolidation (see Design/Server_Consolidation). The auth_server and api_server are separate Flask apps on different ports. This causes:

  • Cookie cross-port sharing failures
  • SQLite threading issues with two in-process Flask servers
  • An implementation agent exhausted its entire context window trying to work around the split

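Once auth and api are merged into a single Flask app, the E2E fixtures reduce to one server on one origin, so neither the cross-port cookie workaround nor a second in-process server is needed.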

Plan After Consolidation

Step 1: Simplify conftest.py

The auth_server fixture currently starts only the auth Flask app. Post-consolidation, it starts the single platform app, which serves both /auth/* and /app/* routes. Rename the fixture to platform_server or just server.

No management_server fixture needed — it's the same app.

No cross-port cookie injection needed — same origin.
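
The single-server fixture could be sketched roughly as follows. Flask applications are WSGI callables, so this sketch uses the standard library's wsgiref server and a placeholder app to stay self-contained; run_platform_server and placeholder_app are hypothetical names, and the real conftest.py may start the server differently.

```python
import threading
from contextlib import contextmanager
from wsgiref.simple_server import WSGIRequestHandler, make_server


class _QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        # Suppress per-request logging so test output stays readable.
        pass


@contextmanager
def run_platform_server(wsgi_app, host="127.0.0.1"):
    """Serve one WSGI app on an ephemeral port in a daemon thread; yield its base URL."""
    httpd = make_server(host, 0, wsgi_app, handler_class=_QuietHandler)
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://{host}:{httpd.server_port}"
    finally:
        httpd.shutdown()
        httpd.server_close()
        thread.join(timeout=5)


def placeholder_app(environ, start_response):
    """Stand-in for the consolidated Flask app (hypothetical)."""
    path = environ.get("PATH_INFO", "")
    known = path.startswith("/auth/") or path.startswith("/app/")
    status = "200 OK" if known else "404 Not Found"
    start_response(status, [("Content-Type", "text/plain")])
    return [path.encode()]
```

A session-scoped pytest fixture would wrap this context manager and yield the base URL. Because /auth/* and /app/* share one host and port, a cookie set during login applies to both without any injection step.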

Step 2: Add new fixtures

authenticated_page (function-scoped):

  • Logs in via mock PDS OAuth flow
  • Returns a Playwright page with valid platform_token cookie
  • Cookie works for all routes (same origin)

wiki_fixture (function-scoped):

  • Creates a wiki directly in DB + filesystem (bypasses route to avoid tier limits)
  • Calls WikiModel.create() + _init_wiki_repo() + _init_wiki_db()
  • Cleans up wiki dir + DB row after test

destructive_page (function-scoped):

  • Separate browser context for tests that destroy state (account deletion, wiki deletion)
  • Prevents cookie/state pollution to other tests
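
The create-then-clean-up shape of wiki_fixture might look like the sketch below. The table layout, directory scheme, and the temporary_wiki name are illustrative assumptions; the real fixture would call WikiModel.create(), _init_wiki_repo() and _init_wiki_db() instead of the inline SQL and mkdir shown here.

```python
import shutil
import sqlite3
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_wiki(db_path, wikis_root, slug, owner_did):
    """Create a wiki row and directory directly, then remove both on exit."""
    wiki_dir = Path(wikis_root) / slug
    conn = sqlite3.connect(db_path)
    try:
        # Assumed schema, for illustration only.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS wikis (slug TEXT PRIMARY KEY, owner_did TEXT)"
        )
        conn.execute(
            "INSERT INTO wikis (slug, owner_did) VALUES (?, ?)", (slug, owner_did)
        )
        conn.commit()
        wiki_dir.mkdir(parents=True)
        yield wiki_dir
    finally:
        # Teardown runs even if the test body raises. Both steps tolerate a
        # wiki that the test itself already deleted.
        conn.execute("DELETE FROM wikis WHERE slug = ?", (slug,))
        conn.commit()
        conn.close()
        shutil.rmtree(wiki_dir, ignore_errors=True)
```

Keeping teardown idempotent matters for the wiki deletion test, where the row and directory are already gone by the time the fixture cleans up.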

Step 3: Implement 11 additional tests

Auth flows (test_auth_flows.py, 3 tests)

  • Auto-redirect when authenticated: Visit /auth/login with valid cookie → redirects to /app/
  • Return-to URL preservation: return_to parameter survives the full OAuth redirect chain
  • Login with DID: Login using a DID instead of a handle (may be redundant with the existing test_oauth_login; check before implementing)

Wiki lifecycle (test_wiki_lifecycle.py, 4 tests — already written)

  • Wiki creation form: Fill slug/name → submit → redirect to settings → MCP token visible
  • Wiki settings update: Change display_name → flash message → persists on reload
  • Wiki deletion with confirmation: Expand danger zone → confirm slug → delete → flash
  • MCP token regeneration: Click regen → JS confirm dialog → new token in flash

Account management (test_account.py, 4 tests; already written, MCP consent test marked skip)

  • Account page renders: Displays DID, handle, created_at from JWT claims
  • Account deletion: Confirm handle → cookie cleared → cascading wiki delete
  • Account deletion wrong confirmation: Wrong handle → stays on page → error flash
  • MCP consent page renders: Consent page shows client info, wiki name, approve/deny buttons

Step 4: Verify existing test files

Agent B's test_wiki_lifecycle.py and Agent C's test_account.py were written against fixture signatures that may differ from the simplified post-consolidation conftest. Review and update selectors/fixture names before running.

Step 5: Run full suite

All 15 tests (4 existing + 11 new) should pass. Run the unit tests as well to confirm there are no regressions.

Architecture Notes

Mock PDS

The mock PDS (tests/e2e/mock_pds.py) implements:

  • POST /xrpc/com.atproto.server.createAccount — creates test accounts with did:plc: DIDs
  • POST /xrpc/com.atproto.server.createSession — handles re-use of existing accounts
  • GET /.well-known/oauth-authorization-server — AS metadata
  • GET /.well-known/oauth-protected-resource — protected resource metadata
  • POST /oauth/par — Pushed Authorization Request
  • GET/POST /oauth/authorize — Login + consent form (simple HTML, not React SPA)
  • POST /oauth/token — Token exchange (skips PKCE verification)
  • GET /did:plc:* — DID document serving (acts as PLC directory)

All endpoints are served on 127.0.0.1 to avoid IPv6 resolution issues.
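
As an illustration of the two metadata endpoints only, here is a standard-library sketch. It is not the code in tests/e2e/mock_pds.py; make_mock_pds is a hypothetical name, and the JSON documents carry just a few of the standard OAuth metadata field names, fewer than a real client is likely to require.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_mock_pds(host="127.0.0.1"):
    """Build (but do not start) a metadata-only mock PDS on an ephemeral port."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # The issuer is derived from the bound port so URLs stay consistent.
            issuer = f"http://{host}:{self.server.server_port}"
            documents = {
                "/.well-known/oauth-authorization-server": {
                    "issuer": issuer,
                    "pushed_authorization_request_endpoint": f"{issuer}/oauth/par",
                    "authorization_endpoint": f"{issuer}/oauth/authorize",
                    "token_endpoint": f"{issuer}/oauth/token",
                },
                "/.well-known/oauth-protected-resource": {
                    "resource": issuer,
                    "authorization_servers": [issuer],
                },
            }
            doc = documents.get(self.path)
            payload = doc if doc is not None else {"error": "not_found"}
            body = json.dumps(payload).encode()
            self.send_response(200 if doc is not None else 404)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Keep test output quiet.
            pass

    return ThreadingHTTPServer((host, 0), Handler)
```

Starting it in a daemon thread with threading.Thread(target=server.serve_forever, daemon=True) and binding explicitly to 127.0.0.1 keeps the mock in-process and sidesteps localhost resolving to ::1.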

Test mode env vars

  • ALLOW_HTTP_PDS=true — relaxes SSRF protections for loopback HTTP (guarded by FLASK_ENV=testing)
  • PLC_DIRECTORY_URL — points at mock PDS for DID resolution
  • PLATFORM_DOMAIN=127.0.0.1:{port} — makes CLIENT_ID/REDIRECT_URI use HTTP
  • WIKI_TEMPLATE_DIR — pointed at nonexistent path for predictable fallback behavior
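
One way to keep these settings together is a small helper that builds the whole test environment in one place. This is a sketch: e2e_environment is a hypothetical name and the nonexistent template path is an arbitrary placeholder.

```python
def e2e_environment(server_port, mock_pds_url):
    """Return the env vars the platform app needs when run against the mock PDS."""
    return {
        # ALLOW_HTTP_PDS is only honoured when FLASK_ENV is "testing".
        "FLASK_ENV": "testing",
        "ALLOW_HTTP_PDS": "true",
        # DID resolution goes to the mock PDS, which doubles as a PLC directory.
        "PLC_DIRECTORY_URL": mock_pds_url,
        # A loopback host:port makes CLIENT_ID / REDIRECT_URI use HTTP.
        "PLATFORM_DOMAIN": f"127.0.0.1:{server_port}",
        # Deliberately nonexistent so the fallback path is always taken.
        "WIKI_TEMPLATE_DIR": "/nonexistent/wiki-templates",
    }
```

In a fixture, each pair can be applied with pytest's monkeypatch.setenv(name, value) so the values are restored automatically after the test session or function ends.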

Docker vs mock

conftest.py uses a three-tier fallback: external PDS already running → Docker Compose → in-process mock. CI with Docker gets the real PDS; devcontainers without Docker get the mock. The mock is sufficient for all current tests.
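
The selection order can be sketched as below. The function names, the way an external PDS URL is supplied, and the use of shutil.which("docker") as the Docker check are all assumptions; the actual conftest.py logic may probe differently.

```python
import shutil
import urllib.error
import urllib.request


def pds_is_reachable(url, timeout=2.0):
    """Return True if anything answers HTTP at url, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, so something is listening there.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        return False


def select_pds_backend(external_url=None, probe=pds_is_reachable, docker_available=None):
    """Pick "external", "docker" or "mock", in that order of preference."""
    if external_url and probe(external_url):
        return "external"
    if docker_available is None:
        # Assumption: a docker binary on PATH means Docker Compose is usable.
        docker_available = shutil.which("docker") is not None
    if docker_available:
        return "docker"
    return "mock"
```

Passing probe and docker_available as parameters keeps the decision itself unit-testable without a network or a Docker daemon.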