Wikibot.io Phase Gates — Phases 0–4

This document defines exit criteria and validation procedures for each phase boundary. The human reviews these before giving the go-ahead to proceed.


How phase gates work

  1. Phase manager completes all tasks in the phase
  2. Manager writes Dev/Phase N Summary to the wiki with results
  3. Human reviews the summary and this document's exit criteria
  4. Human performs the validation steps below
  5. Go/no-go decision — if go, the next phase's manager can start

Phase 0 Gate: Proof of Concept

Exit criteria

| Criterion | Target | How to verify |
|---|---|---|
| EFS warm read latency | < 500ms | Benchmark results in Dev/Phase 0 — EFS Benchmarks |
| EFS warm write latency | < 1s | Benchmark results |
| Lambda cold start (VPC + EFS) | < 5s | Benchmark results |
| Concurrent reads | No errors | Benchmark results (3+ simultaneous) |
| Concurrent writes | Serialized correctly | Benchmark results (5 simultaneous, git locking) |
| MCP OAuth via WorkOS | Working | Claude.ai connects and calls echo tool |
| Git library decision | Documented | Decision in wiki or PR description |
| Apple provider sub | Status known | Documented (verified or flagged as unavailable) |
| Billing alarm | Active | Check AWS Budgets console |

Validation steps

  1. Review benchmarks. Read Dev/Phase 0 — EFS Benchmarks wiki note. Are the numbers acceptable? Do they leave headroom for the full Otterwiki app (which will be heavier than the PoC)?

  2. Test MCP yourself. Open Claude.ai, connect to the PoC MCP endpoint. Call the echo tool. Verify the OAuth flow is smooth enough for end users.

  3. Check costs. Review the AWS bill or Cost Explorer. Is the dev stack cost in line with expectations (~$0.50/mo baseline)?

  4. Review decisions. Read the git library decision (gitpython vs. dulwich). Does the rationale make sense?

  5. Check the Pulumi state. Run pulumi stack to verify the dev stack is clean and all resources are tracked.

Known risks to evaluate

  • VPC cold starts: If > 5s, consider Provisioned Concurrency (~$10-15/mo for 1 warm instance). Is the cost acceptable?
  • EFS latency variance: NFS latency can spike under load. Are the P95 numbers acceptable, not just averages?
  • WorkOS quirks: Any unexpected behavior in the OAuth flow? Token lifetimes? Refresh behavior?
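
When judging the EFS numbers, percentiles matter more than means. A quick way to pull P50/P95 out of a raw latency log (one millisecond value per line; the sample values and the latencies.txt filename here are stand-ins for the real benchmark output):

```shell
# Stand-in benchmark log: one latency sample (ms) per line.
printf '%s\n' 412 388 455 1020 430 401 397 860 420 405 > latencies.txt

# Nearest-rank percentiles -- rough, but fine for a go/no-go sanity check.
sort -n latencies.txt | awk '
  { v[NR] = $1 }
  END { printf "p50=%sms p95=%sms n=%d\n", v[int(NR*0.5)], v[int(NR*0.95)], NR }'
```

A single slow outlier (here, the 1020ms sample) barely moves the mean but shows up immediately in the P95.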

Go/no-go decision

  • Go if all targets met and no surprising cost or latency issues
  • No-go if EFS latency is unacceptable → evaluate Fly.io fallback (PRD section: Alternatives Considered)
  • No-go if MCP OAuth doesn't work → investigate alternative auth providers or debug WorkOS integration

Phase 1 Gate: Single-User Serverless Wiki

Exit criteria

| Criterion | Target | How to verify |
|---|---|---|
| Web UI works | Pages load, edit, save | Browse dev.wikibot.io |
| REST API works | All endpoints respond correctly | Run integration tests |
| MCP works | All 12 tools functional | Connect Claude.ai or Claude Code, exercise tools |
| Semantic search works | Returns relevant results | Search for a concept, verify results |
| Git history | Correct authorship per write path | Check git log on EFS |
| Routing + TLS | All endpoints on custom domain with valid cert | Browser + curl |
| Architecture decision | Same vs. separate Lambda for MCP | Documented with rationale |

Validation steps

  1. Browse the wiki. Go to dev.wikibot.io. Create a page with WikiLinks. Verify the web UI is responsive and functional.

  2. Test the API. Run the integration test suite or manually curl a few endpoints:

    curl -H "Authorization: Bearer $KEY" https://dev.wikibot.io/api/v1/pages
    curl -H "Authorization: Bearer $KEY" "https://dev.wikibot.io/api/v1/search?q=test"
    
  3. Test MCP. Connect Claude.ai to dev.wikibot.io/mcp. Exercise read_note, write_note, search_notes, semantic_search. Verify results are correct.

  4. Check semantic search quality. Write a few test pages, then search for concepts using different wording. Are the results relevant?

  5. Check git authorship. Pages created via web UI should show one author; pages via API/MCP should show the configured API author. Verify in git log.

  6. Performance sanity check. Is the web UI snappy enough? Do API calls return in < 1s? Is MCP responsive?
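
Step 5's git log check can be scripted. A sketch against a throwaway repo (the author names and commit messages are placeholders; against the real EFS repo you would skip the setup and run only the final log command):

```shell
# Throwaway repo standing in for the wiki's EFS repo.
git init -q authorship-demo

# Simulate one write per path: web UI commits as the logged-in user,
# API/MCP commits as the configured API author (names are placeholders).
echo home > authorship-demo/Home.md
git -C authorship-demo add Home.md
git -C authorship-demo -c user.name='Alice' -c user.email='alice@example.com' \
  commit -qm 'web ui edit'
echo api > authorship-demo/Api.md
git -C authorship-demo add Api.md
git -C authorship-demo -c user.name='wikibot-api' -c user.email='api@wikibot.io' \
  commit -qm 'mcp write'

# The actual check: count commits per author. Each write path should
# appear under its own identity.
git -C authorship-demo log --format='%an' | sort | uniq -c
```

If every commit shows up under a single author regardless of write path, the per-path author configuration is not being applied.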

Known risks to evaluate

  • Mangum compatibility: Any Flask features that don't work under Mangum? (Sessions, file uploads, streaming responses)
  • FAISS index persistence: Does the index survive Lambda recycling? Is it loaded fast enough on cold start?
  • Lambda package size: Is the deployment package within Lambda's limits (250MB unzipped for zip deploys, 10GB for container images)? If too large, container images may be needed.
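
The package-size risk is mechanical to check before deploying. A sketch, with a stand-in build/package directory in place of the real unpacked deployment artifact:

```shell
# Stand-in for the unpacked deployment package directory; point this at
# the real build output in practice.
mkdir -p build/package
head -c 1048576 /dev/zero > build/package/dep.bin  # 1MB placeholder

# Lambda's zip-deploy limit applies to the *unzipped* package: 250MB.
size_kb=$(du -sk build/package | awk '{ print $1 }')
limit_kb=$((250 * 1024))
echo "package size: ${size_kb}KB (limit: ${limit_kb}KB)"
[ "$size_kb" -le "$limit_kb" ] \
  && echo 'OK' \
  || echo 'too large: switch to a container image (10GB limit)'
```

Running this in CI on every build catches dependency bloat before it becomes a deploy failure.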

Go/no-go decision

  • Go if all features work and performance is acceptable
  • Partial go if semantic search has issues → defer to Phase 5 (it's a premium feature anyway)
  • No-go if core wiki functionality is broken or too slow

Phase 2 Gate: Multi-Tenancy and Auth

Exit criteria

| Criterion | Target | How to verify |
|---|---|---|
| Multi-user auth | Two users can log in independently | Test with two accounts |
| Wiki isolation | User A cannot access User B's private wiki | ACL enforcement test |
| Management API | All endpoints work | Integration tests |
| ACL enforcement | All roles enforced correctly | E2E test (P2-10) |
| Public wikis | Anonymous read access works | Test without auth |
| CLI tool | All commands work | Run each command |
| Bootstrap template | New wikis initialized correctly | Create wiki, inspect pages |
| Admin panel hiding | Disabled sections hidden and return 404 | Browse admin as owner |
| PROXY_HEADER auth | All permission levels work | Test each role |
| Username handling | Validation, uniqueness, reserved names | Attempt invalid usernames |
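
The username rules live in the Phase 2 validation code; as a local sketch of what "attempt invalid usernames" should exercise, assume lowercase alphanumerics plus hyphens, 3–32 characters, no leading or trailing hyphen, and a small reserved-name blocklist (all of these rules and the list below are assumptions, not the shipped policy):

```shell
# Assumed policy -- check against the real Phase 2 validation code.
reserved='admin api www mail dev staging'

check_username() {
  u=$1
  # Reserved names are rejected before format checks.
  case " $reserved " in *" $u "*) echo "reject: $u (reserved)"; return ;; esac
  # 3-32 chars, lowercase alphanumerics and hyphens, no edge hyphens.
  if printf '%s' "$u" | grep -Eq '^[a-z0-9][a-z0-9-]{1,30}[a-z0-9]$'; then
    echo "accept: $u"
  else
    echo "reject: $u (invalid)"
  fi
}

check_username alice    # accepted
check_username admin    # rejected: reserved
check_username -dash    # rejected: leading hyphen
check_username ab       # rejected: too short
```

Whatever the real rules are, the gate test should cover the same three failure classes: reserved, malformed, and too short/long.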

Validation steps

  1. Create two test accounts. Sign up as two different users (two Google accounts or Google + GitHub).

  2. Test isolation. User A creates a private wiki. Log in as User B. Verify User B cannot see, list, or access User A's wiki via web UI, API, or MCP.

  3. Test ACLs. User A grants User B editor access. Verify User B can now read and write. User A revokes. Verify 403.

  4. Test public wiki. User A makes a wiki public. Open in incognito (no auth). Verify read-only access.

  5. Test the CLI. Run through the full CLI workflow:

    wiki create my-wiki "My Wiki"
    wiki list
    wiki token my-wiki
    wiki grant my-wiki friend@example.com editor
    wiki revoke my-wiki friend@example.com
    wiki delete my-wiki
    
  6. Inspect bootstrap template. Create a new wiki, then read the Home page and Wiki Usage Guide. Are they clear and complete?

  7. Test admin panel. Log in as wiki owner, go to admin panel. Verify only Application Preferences, Sidebar Preferences, and Content and Editing are visible. Verify disabled routes return 404.

  8. Test tier limits. As a free user, try to create a second wiki. Try to add a 4th collaborator. Verify clear error messages.

Known risks to evaluate

  • DynamoDB latency: ACL checks add a DynamoDB read to every request. Is the added latency acceptable? Should we cache in Lambda memory?
  • WorkOS token lifetimes: How long do MCP OAuth tokens last before refresh? Does Claude.ai handle refresh correctly?
  • Username squatting: Not a launch concern, but are the reserved name checks in place?

Go/no-go decision

  • Go if multi-tenancy works correctly and securely
  • No-go if tenant isolation has any gaps — this is a security requirement, not a feature

Phase 3 Gate: Frontend

Exit criteria

| Criterion | Target | How to verify |
|---|---|---|
| SPA loads | Dashboard accessible after login | Browser test |
| Auth flow | Login, logout, token refresh | Test each flow |
| Wiki CRUD | Create, view, delete wiki from UI | Browser test |
| Collaborator management | Invite, change role, revoke from UI | Browser test |
| MCP instructions | Correct, copyable command | Verify command works |
| Public wiki toggle | Works from settings UI | Toggle and verify |
| Static hosting | SPA served via CloudFront | Check response headers |
| Mobile responsive | Usable on phone | Browser dev tools or real device |

Validation steps

  1. Full user journey. In a fresh incognito window:

    • Visit wikibot.io → see landing/login
    • Log in with Google → land on dashboard
    • Create a wiki → see token and instructions
    • Copy claude mcp add command → paste in terminal → verify MCP connects
    • Go to wiki settings → invite a collaborator, toggle public
    • Log out → verify redirected to login
  2. Test on mobile. Open wikibot.io on a phone. Is the dashboard usable? Can you create a wiki?

  3. Check error states. What happens when the API is down? When you enter an invalid wiki slug? When you try to create a wiki and hit the tier limit?

  4. Performance. How fast does the SPA load? Is the bundle size reasonable (< 500KB gzipped)?
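
The 500KB gzipped budget in step 4 is easy to check mechanically. A sketch (the dist/ directory and its contents are stand-ins for the real build output):

```shell
# Stand-in bundle; point this at the real SPA build output in practice.
mkdir -p dist
head -c 204800 /dev/urandom > dist/app.js  # ~200KB placeholder asset

# Sum the gzipped size of all JS/CSS assets and compare to the budget.
total=0
for f in dist/*.js dist/*.css; do
  [ -e "$f" ] || continue
  sz=$(gzip -c "$f" | wc -c)
  total=$((total + sz))
done
echo "gzipped total: $((total / 1024))KB (budget: 500KB)"
[ "$total" -le $((500 * 1024)) ] && echo 'within budget' || echo 'over budget'
```

Measuring the gzipped size (what actually crosses the wire) rather than the on-disk size is the number that matters for load time.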

Known risks to evaluate

  • Framework choice: Does the chosen framework (React or Svelte) feel right? Any regrets?
  • Auth UX: Is the login flow smooth? Any confusing redirects or error messages?
  • MCP instructions clarity: Would a new user understand how to connect? Test with fresh eyes.

Go/no-go decision

  • Go if the user journey works end-to-end and the UX is acceptable
  • Go with issues if cosmetic issues remain — they can be fixed post-launch
  • No-go if auth flow or core wiki management is broken

Phase 4 Gate: Launch Readiness

Exit criteria

| Criterion | Target | How to verify |
|---|---|---|
| Git clone | git clone works for authorized users | Test from command line |
| Git auth | Bearer token authentication works | Clone with token |
| WAF active | Rate limiting and OWASP rules | Check WAF console, test rate limit |
| Monitoring | Dashboard shows traffic, alarms configured | CloudWatch console |
| Backups | EFS backup running, DynamoDB PITR active | AWS Backup console, DynamoDB settings |
| Backup restore | Tested at least once | Restore test documented |
| Landing page | Loads, explains product, has CTA | Browser test |
| Docs | Getting started, MCP setup documented | Read through docs |

Validation steps

  1. Test git clone. Create a wiki, write a few pages via MCP, then:

    git clone https://token:<bearer>@<user>.wikibot.io/<wiki>.git
    

    Verify the repo contains the expected pages.

  2. Test rate limiting. Hit an endpoint rapidly (> 100 requests/minute). Verify WAF blocks with 429 or 403. Verify normal usage is not affected.

  3. Review monitoring. Look at the CloudWatch dashboard. Does it show the traffic from your testing? Are all panels populated?

  4. Test backup restore. Restore an EFS backup snapshot. Verify the git repo is intact. This can be a throwaway test — restore to a new filesystem, mount it, inspect, delete.

  5. Review landing page. Read it as if you've never heard of wikibot.io. Does it explain the product clearly? Does the CTA lead to signup?

  6. Read the docs. Follow the "Getting Started" guide from scratch. Does it work?

  7. Security review. Check:

    • No sensitive data in public repos or frontend bundle
    • API keys rotated from development values
    • WAF rules active
    • No open security group rules
    • All endpoints require auth (except public wikis and landing page)
  8. Cost review. Check AWS bill. Is the total in line with projections? Any unexpected charges?

Final checklist before launch

  • All Phase 0–4 exit criteria met
  • No critical bugs in the issue tracker
  • Backup restore tested successfully
  • Monitoring alarms tested (at least one alarm fired and notified)
  • Landing page and docs reviewed
  • DNS configured for production domain (wikibot.io)
  • Production Pulumi stack deployed (separate from dev)
  • Production secrets rotated from dev values
  • WAF active on production
  • WorkOS configured for production domain

Go/no-go decision

  • Go if all checklist items pass
  • Soft launch if minor issues remain — launch to a small group, fix in production
  • No-go if security issues, data loss risks, or fundamental UX problems remain