---
status: current
platform: VPS (robot.wtf)
---
> Extracted from the original wikibot.io design. AWS-specific content archived at [[Archive/AWS_Design/Operations]].

See also: [[Design/Platform_Overview]], [[Design/Data_Model]], [[Design/Auth]], [[Design/VPS_Architecture]].

---

## Wiki Bootstrap Template

When a user creates a new wiki, the repo is initialized with a starter page set that teaches Claude how to use the wiki effectively. The user connects MCP, starts a conversation, and Claude already knows the conventions.

### Initial pages

**Home** — Landing page with the wiki's name and purpose (user-provided at creation), links to the guide and any starter pages.

**Meta/Wiki Usage Guide** — Instructions for the AI assistant:
- Available MCP tools and what they do
- Session start protocol (read Home first, then check recent changes)
- Page conventions: frontmatter schema, WikiLink syntax, page size guidance (~250–800 words)
- Commit message format
- When to create new pages vs. update existing ones
- How to use categories and tags
- Gardening responsibilities (orphan detection, stale page review, link maintenance)

**Meta/Page Template** — A reference page showing the frontmatter schema, section structure, and WikiLink usage. Claude can copy this pattern when creating new pages.

### Customization

The bootstrap template is parameterized by:
- Wiki name (provided at creation)
- Wiki purpose/description (optional, provided at creation)
- Category set (default set provided, user can customize later)

The default category set matches the existing schema (`actor`, `event`, `trend`, `hypothesis`, `variable`, `reference`, `index`) but users can define their own categories for different research domains.
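
To make the parameterization concrete, here is a minimal sketch of how the starter pages might be rendered at creation time. The function, the frontmatter fields, and `PAGE_TEMPLATE` are illustrative assumptions, not a settled schema:

```python
# Hypothetical sketch: render the starter page set from creation-time parameters.
DEFAULT_CATEGORIES = ["actor", "event", "trend", "hypothesis", "variable", "reference", "index"]

PAGE_TEMPLATE = """\
---
title: {title}
category: {category}
status: current
---

Body text goes here. Link related pages with [[WikiLinks]].
"""

def bootstrap_pages(name: str, purpose: str = "", categories: list[str] | None = None) -> dict[str, str]:
    """Return {page path: markdown content} for a new wiki's initial commit."""
    categories = categories or DEFAULT_CATEGORIES
    home = f"# {name}\n\n{purpose}\n\nStart with [[Meta/Wiki Usage Guide]] and [[Meta/Page Template]].\n"
    guide = "# Wiki Usage Guide\n\nCategories: " + ", ".join(f"`{c}`" for c in categories) + "\n"
    template = PAGE_TEMPLATE.format(title="Example Page", category=categories[0])
    return {"Home.md": home, "Meta/Wiki Usage Guide.md": guide, "Meta/Page Template.md": template}
```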

### Custom template repos (premium)

Premium users can create a wiki from any public (or authenticated) Git repo URL. The server clones the template repo, strips its git history, and commits the contents as the wiki's initial state (sketched after the list below). This enables:

- Shared team templates ("our standard research wiki layout")
- Domain-specific starter kits (e.g., a policy analysis template, a technical due diligence template)
- Community-contributed templates (a future marketplace opportunity)
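
A minimal sketch of that clone-and-strip flow using plain `git` via `subprocess`; the one-bare-repo-per-wiki layout is an assumption:

```python
import shutil
import subprocess
import tempfile

def create_wiki_from_template(template_url: str, wiki_repo_path: str) -> None:
    """Clone a template repo, strip its history, commit contents as the wiki's initial state."""
    with tempfile.TemporaryDirectory() as tmp:
        # A shallow clone suffices; the template's history is discarded anyway.
        subprocess.run(["git", "clone", "--depth", "1", template_url, tmp], check=True)
        shutil.rmtree(f"{tmp}/.git")  # strip template history
        subprocess.run(["git", "init"], cwd=tmp, check=True)
        subprocess.run(["git", "add", "-A"], cwd=tmp, check=True)
        subprocess.run(["git", "commit", "-m", "Initialize wiki from template"], cwd=tmp, check=True)
        # Publish as the wiki's bare repo (assumed layout: one bare repo per wiki).
        subprocess.run(["git", "clone", "--bare", tmp, wiki_repo_path], check=True)
```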

---

## Attachment Storage

Otterwiki stores attachments as regular files in the git repo and serves them directly from the working tree.

### MVP approach

Store attachments in the git repo as-is. Tier limits (50MB free, 1GB premium) keep repo sizes manageable.

### Future optimization: external attachment storage

If large attachments become a problem (disk usage, Git remote clone times), decouple attachment storage from the git repo, as sketched after these steps:

1. On upload: store the attachment externally at a known path, commit only a lightweight reference file to git (similar to Git LFS pointer format)
2. On serve: intercept Otterwiki's attachment serving path, resolve the reference, and serve from external storage
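
A sketch of what the reference-file round trip could look like. The pointer format below only imitates Git LFS, and the storage root is an assumption:

```python
import hashlib
from pathlib import Path

ATTACHMENT_ROOT = Path("/srv/attachments")  # assumed external storage root

def store_attachment(data: bytes) -> str:
    """Write the blob to external storage; return the pointer file to commit in its place."""
    oid = hashlib.sha256(data).hexdigest()
    blob = ATTACHMENT_ROOT / oid[:2] / oid  # content-addressed, so dedup is free
    blob.parent.mkdir(parents=True, exist_ok=True)
    blob.write_bytes(data)
    # LFS-like pointer: tiny, diff-friendly, safe to keep in git.
    return f"version https://robot.wtf/attachment-pointer/v1\noid sha256:{oid}\nsize {len(data)}\n"

def resolve_attachment(pointer: str) -> bytes:
    """Invert store_attachment() on the serving path."""
    fields = dict(line.split(" ", 1) for line in pointer.splitlines())
    oid = fields["oid"].split(":", 1)[1]
    return (ATTACHMENT_ROOT / oid[:2] / oid).read_bytes()
```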

This could be implemented as:
- **Otterwiki plugin** that hooks into the attachment upload/serve lifecycle
- **Upstream patch** to Otterwiki adding a pluggable storage backend for attachments (local filesystem vs. external)

The plugin or upstream patch approach is preferable — it benefits the broader Otterwiki community and keeps our fork minimal.

---

## Git Remote Access

Every wiki's bare repo is directly accessible via Git protocol over HTTPS. This is a core feature, not an afterthought — users should never feel locked in.

### Hosted Git remote

```
https://sderle.robot.wtf/third-gulf-war.git
```

Authentication: an OAuth JWT or MCP bearer token supplied via a Git credential helper, or a dedicated Git access token (simpler for CLI usage).

**Free tier**: read-only. Users can `git clone` and `git pull` their wiki at any time. This is a data portability guarantee — your wiki is always yours.

**Premium tier**: read/write. Users can `git push` to the hosted remote, enabling workflows like local editing, CI/CD integration, or scripted bulk imports.

### Implementation

An HTTP route (`/{user}/{wiki}.git/*`) served by the app implements the Git smart HTTP protocol (`git-upload-pack` for clone/fetch, `git-receive-pack` for push), accessing the same on-disk repos as the wiki handlers. A read-only sketch follows.
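
A minimal sketch of the read-only half (clone/fetch) as a Flask view, shelling out to `git` in stateless-RPC mode. The route shape and repo layout are assumptions; push handling, auth checks, and gzip-encoded request bodies are elided:

```python
import subprocess
from flask import Flask, Response, request

app = Flask(__name__)
REPO_ROOT = "/srv/wikis"  # assumed layout: /srv/wikis/{user}/{wiki}.git

@app.get("/<user>/<wiki>.git/info/refs")
def advertise_refs(user, wiki):
    if request.args.get("service") != "git-upload-pack":
        return Response(status=403)  # read-only: no receive-pack here
    repo = f"{REPO_ROOT}/{user}/{wiki}.git"
    refs = subprocess.run(
        ["git", "upload-pack", "--stateless-rpc", "--advertise-refs", repo],
        capture_output=True, check=True).stdout
    # Smart HTTP requires a pkt-line service header before the ref advertisement.
    head = b"001e# service=git-upload-pack\n0000"
    return Response(head + refs, content_type="application/x-git-upload-pack-advertisement")

@app.post("/<user>/<wiki>.git/git-upload-pack")
def upload_pack(user, wiki):
    repo = f"{REPO_ROOT}/{user}/{wiki}.git"
    out = subprocess.run(["git", "upload-pack", "--stateless-rpc", repo],
                         input=request.get_data(), capture_output=True, check=True).stdout
    return Response(out, content_type="application/x-git-upload-pack-result")
```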

### External Git sync (premium, future)

Bidirectional sync with an external remote (GitHub, GitLab, etc.). Each pass is triggered on a schedule or by webhook and does the following (sketched after the list):

1. Open wiki repo
2. `git fetch` from configured external remote
3. Attempt fast-forward merge (no conflicts → auto-merge)
4. Conflicts → flag for human resolution, do not auto-merge
5. Push merged state to external remote
6. Trigger re-embedding if semantic search enabled
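
A sketch of one sync pass over a bare wiki repo, in terms of plain `git` commands; the `external` remote name and the `flag_conflict` hook are placeholders:

```python
import subprocess

def _git(repo: str, *args: str, check: bool = True) -> subprocess.CompletedProcess:
    return subprocess.run(["git", "-C", repo, *args], check=check)

def _is_ancestor(repo: str, a: str, b: str) -> bool:
    return _git(repo, "merge-base", "--is-ancestor", a, b, check=False).returncode == 0

def sync_with_external(repo: str, branch: str = "main") -> None:
    """One pass: fetch, fast-forward if possible, flag divergence, push back."""
    _git(repo, "fetch", "external")
    local, remote = branch, f"external/{branch}"
    if _is_ancestor(repo, local, remote):
        _git(repo, "update-ref", f"refs/heads/{branch}", remote)  # clean fast-forward
    elif not _is_ancestor(repo, remote, local):
        flag_conflict(repo)  # placeholder: histories diverged, a human resolves
        return
    _git(repo, "push", "external", branch)
    # If semantic search is enabled, trigger re-embedding here.

def flag_conflict(repo: str) -> None:
    print(f"sync conflict in {repo}; manual resolution required")
```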

---

## Otterwiki Fork Management

The Otterwiki fork is kept as minimal as possible. All customizations are either:
1. **Plugins** (preferred) — no core changes needed
2. **Small, upstreamable patches** — contributed to `schuyler/otterwiki` and submitted as PRs to the upstream `redimp/otterwiki` project
3. **Platform-specific overrides** — admin panel section hiding, template conditionals (kept in a separate branch or patch set)

### Merge strategy

- Track upstream `redimp/otterwiki` as a remote
- Periodically rebase or merge upstream changes into the fork
- Keep platform-specific changes isolated (ideally a thin layer on top, not interleaved with upstream code)
- Automated CI check: does the fork still pass upstream's test suite after merge?

### Upstream relationship

We want to support Otterwiki as a project. Contributions go upstream where possible. If the product generates revenue, donate a portion to the upstream maintainer.

---

## Backup and Disaster Recovery

### What we're protecting

| Data | Source of truth | Severity of loss |
|------|-----------------|------------------|
| Git repos (wiki content) | Local disk | **Critical** — user data, irreplaceable |
| Platform DB (users, wikis, ACLs) | SQLite/PostgreSQL | **High** — reconstructable from repos but painful |
| FAISS indexes | Local disk | **Low** — fully rebuildable from repo content |
| Auth provider state | WorkOS (external) | **Low** — managed by vendor |

### Backup strategy

**Git repos**: rsync to off-site storage, daily. See [[Design/VPS_Architecture]] for specifics.

**Platform DB**: Daily dump + rsync. Point-in-time recovery if using PostgreSQL.

**FAISS indexes**: No backup needed. Rebuildable from repo content (MiniLM runs locally, no API cost).
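
The daily job could be as small as the following sketch; the off-site host and all paths are placeholders, and the SQLite branch uses the CLI's online-safe `.backup` command:

```python
import subprocess
from datetime import date

def nightly_backup() -> None:
    # Git repos: rsync the bare-repo tree off-site.
    subprocess.run(["rsync", "-az", "--delete", "/srv/wikis/",
                    "backup@offsite:/backups/wikis/"], check=True)
    # Platform DB (SQLite case): take a consistent snapshot, then ship it.
    stamp = date.today().isoformat()
    subprocess.run(["sqlite3", "/srv/platform.db",
                    f".backup /srv/backups/platform-{stamp}.db"], check=True)
    subprocess.run(["rsync", "-az", "/srv/backups/", "backup@offsite:/backups/db/"], check=True)
    # FAISS indexes: intentionally skipped; rebuildable from repo content.
```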

### Design principle

Git repos are the source of truth. Everything else (platform DB records, FAISS indexes) is either backed up independently or rebuildable from the repos.

---

## Account Lifecycle

### Data retention

User accounts and wiki data are retained indefinitely regardless of activity. Storage cost for an idle wiki is effectively zero. There is no reason to delete inactive accounts — it costs nothing to keep them, and deleting user data is irreversible.

### Account deletion

Users can delete their account from the dashboard. This does the following (a sketch follows the list):
1. Deletes all wikis owned by the user (repo, FAISS index, metadata)
2. Removes all ACL grants the user has on other wikis
3. Deletes the user record from the platform DB
4. Does NOT delete the auth provider account (Google/GitHub/etc.) — that's the user's own account
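
A sketch of the deletion path under an assumed schema; the `users`, `wikis`, and `acl_grants` tables and the directory layout are illustrative:

```python
import shutil
import sqlite3

def delete_account(db: sqlite3.Connection, user_id: int) -> None:
    """Irreversibly delete a user: wikis first, then grants, then the user row."""
    wikis = db.execute("SELECT slug FROM wikis WHERE owner_id = ?", (user_id,)).fetchall()
    for (slug,) in wikis:
        shutil.rmtree(f"/srv/wikis/{user_id}/{slug}.git", ignore_errors=True)  # repo
        shutil.rmtree(f"/srv/indexes/{user_id}/{slug}", ignore_errors=True)    # FAISS index
    db.execute("DELETE FROM wikis WHERE owner_id = ?", (user_id,))
    db.execute("DELETE FROM acl_grants WHERE user_id = ?", (user_id,))
    db.execute("DELETE FROM users WHERE id = ?", (user_id,))
    db.commit()
    # The upstream auth provider account (Google/GitHub/etc.) is deliberately untouched.
```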

Deletion is permanent and irreversible. Require explicit confirmation ("type your username to confirm").

### GDPR

If serving EU users: account deletion satisfies right-to-erasure. Add a data export endpoint (download all wikis as a zip of git repos) to satisfy right-to-portability — though the Git remote access feature already provides this.
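
If the dedicated endpoint is wanted anyway, it stays small. A sketch assuming Flask and the same per-user repo layout as above:

```python
import io
import zipfile
from pathlib import Path
from flask import Flask, send_file

app = Flask(__name__)

@app.get("/export/<int:user_id>")
def export_all_wikis(user_id: int):
    """Return every repo the user owns as one zip (auth check elided)."""
    root = Path(f"/srv/wikis/{user_id}")  # assumed layout
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in root.rglob("*"):
            if path.is_file():
                zf.write(path, path.relative_to(root))
    buf.seek(0)
    return send_file(buf, mimetype="application/zip", download_name="wikis.zip")
```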

---

## MCP Discoverability

MCP tool descriptions must be self-documenting — any MCP-capable client (Claude, GPT, Gemini, open-source agents) should be able to use the wiki tools without reading external documentation.

Each tool's MCP description should include (example after the list):
- What it does
- Parameter semantics (e.g., "path is like `Actors/Iran`, not a filesystem path")
- What the return format looks like
- Common next actions ("use `list_notes` to find available pages if you don't know the path")
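
For example, a description for a hypothetical `read_note` tool that meets these criteria; the name/description/inputSchema shape follows MCP tool definitions, but the tool itself and its wording are illustrative:

```python
READ_NOTE_TOOL = {
    "name": "read_note",
    "description": (
        "Read a wiki page and return its markdown, including frontmatter. "
        "The 'path' is a wiki path like 'Actors/Iran', not a filesystem path. "
        "Returns the raw page text; [[WikiLinks]] in it name other pages. "
        "If you don't know the path, call list_notes first to see available pages."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Wiki path, e.g. 'Actors/Iran'"}
        },
        "required": ["path"],
    },
}
```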

The bootstrap template's Meta/Wiki Usage Guide provides Claude-specific conventions (session protocol, gardening duties), but the MCP tools themselves should work without it. The guide is an optimization, not a prerequisite.

---

## Rate Limiting and Abuse Prevention

**Launch**: OAuth-only accounts + tier limits (1 wiki, 500 pages, 3 collaborators) provide sufficient abuse prevention at low traffic. Public wiki routes are the only unauthenticated surface — an acceptable risk at launch with near-zero users.

**Post-launch (when traffic justifies it)**: IP-based rate limiting via reverse proxy (nginx/Caddy). Add geographic blocking, bot control, and OWASP Top 10 rule sets via a WAF or application-level middleware.

**Per-user rate limiting (premium launch)**: When the premium tier ships, add per-user throttling on API and MCP endpoints; define specific limits when the need materializes (a minimal sketch follows).
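
When that time comes, the machinery is small. A per-user token-bucket sketch; the rate and capacity numbers are purely illustrative, not decided limits:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-user token bucket; numbers here are illustrative, not decided limits."""

    def __init__(self, rate: float = 1.0, capacity: float = 60.0):
        self.rate, self.capacity = rate, capacity  # ~1 request/sec, burst of 60
        self.state: dict[str, tuple[float, float]] = defaultdict(
            lambda: (capacity, time.monotonic()))  # (tokens, last refill time)

    def allow(self, user_id: str) -> bool:
        tokens, last = self.state[user_id]
        now = time.monotonic()
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        self.state[user_id] = (tokens - 1.0 if allowed else tokens, now)
        return allowed
```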