---
category: reference
tags: [design, performance, cdn, caching, architecture]
last_updated: 2026-03-14
confidence: medium
---

# CDN Read Path Architecture

> **Superseded.** This page addresses Lambda cold start latency for browser reads, a problem that doesn't exist on a VPS with persistent Gunicorn processes. See [[Design/VPS_Architecture]] for the current plan. The fragment caching concepts could be revisited if read performance becomes a concern, but the specific AWS/CloudFront/S3 implementation is not applicable.

**Status:** Proposal (evaluating options)
**Relates to:** [[Tasks/Emergent]] (E-1, E-2), [[Dev/E-1_Cold_Start_Benchmarks]], [[Design/Platform_Overview]], [[Design/Operations]]

## Problem

Otterwiki is a traditional Flask/WSGI application. Importing `otterwiki.server` takes ~3.5s due to Flask app factory initialization, SQLAlchemy model creation, Jinja2 template loading, pluggy hook execution, and transitive imports (numpy, faiss, lxml, Pillow). See [[Dev/E-1_Cold_Start_Benchmarks]] for the full breakdown.

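As a rough way to reproduce this kind of measurement locally, the sketch below times a cold import in a fresh interpreter and subtracts bare interpreter startup. The helper names are hypothetical and this is not the harness used for the E-1 benchmarks; CPython's built-in `python -X importtime` flag gives a per-module breakdown if more detail is needed.

```python
import subprocess
import sys
import time


def cold_start_seconds(statement: str) -> float:
    """Wall-clock time to launch a fresh interpreter and run `statement`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", statement], check=True)
    return time.perf_counter() - start


def import_overhead_seconds(module: str) -> float:
    """Approximate import cost of `module`, net of interpreter startup."""
    baseline = cold_start_seconds("pass")
    with_import = cold_start_seconds(f"import {module}")
    # Timing noise can make the difference slightly negative for tiny modules.
    return max(0.0, with_import - baseline)


if __name__ == "__main__":
    # `json` is a stand-in; substitute `otterwiki.server` where it is installed.
    print(f"json: {import_overhead_seconds('json'):.3f}s")
```

A single run is noisy; repeating the measurement and taking the median would give a steadier figure.
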
This makes the Otterwiki Lambda unsuitable for serving browser reads. A cold start produces a 4.5–5.7s page load, well past the 2.5s "good" LCP threshold. VPC/EFS overhead is negligible (~80ms init); the bottleneck is Python module initialization and cannot be meaningfully reduced by memory scaling or lazy imports alone.

Meanwhile, wiki pages are written infrequently (during Claude MCP sessions) and read far more often (browsing, reference, sharing). The read path should not depend on the heavy Lambda.

## Constraints

- Zero (or near-zero) cost at rest
- No new CDN provider (stay on CloudFront)
- Private wikis must enforce auth before serving content
- MCP and API write paths continue to use the existing VPC Lambda
- Solution must work with Otterwiki's existing templates and UI (sidebar navigation, page chrome, CSS/JS)

## Fragment Model (common to all options below)

On every write (via MCP or API), the warm Otterwiki Lambda renders and stores HTML fragments in S3:

- **Content fragment** (`/fragments/{user}/{wiki}/pages/{Page_Path}.html`): the rendered markdown for that page, inside Otterwiki's content `<div>`. Updated only when that specific page is edited.
- **Sidebar fragment** (`/fragments/{user}/{wiki}/sidebar.html`): the wiki navigation tree. Updated on page create, delete, or rename. NOT updated on content-only edits.
- **Shell template** (`/fragments/{user}/{wiki}/shell.html`): the page chrome (header, footer, CSS/JS links, layout). Updated on Otterwiki settings changes (theme, sidebar preferences) or on deploy. Changes rarely.

Each write produces 1–2 S3 PUTs: always the content fragment, plus the sidebar fragment if the page list changed. This scales to any wiki size: a 500-page wiki still only touches 1–2 objects per write.

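The key layout and the 1–2 PUT rule above can be sketched as follows. This is a minimal illustration, not existing code: the `WriteKind` enum and function names are hypothetical, and the keys drop the leading slash on the assumption that S3 object keys are stored without one.

```python
from enum import Enum
from typing import List


class WriteKind(Enum):
    EDIT = "edit"      # content-only change to an existing page
    CREATE = "create"
    DELETE = "delete"
    RENAME = "rename"


def content_key(user: str, wiki: str, page_path: str) -> str:
    return f"fragments/{user}/{wiki}/pages/{page_path}.html"


def sidebar_key(user: str, wiki: str) -> str:
    return f"fragments/{user}/{wiki}/sidebar.html"


def fragments_to_put(user: str, wiki: str, page_path: str, kind: WriteKind) -> List[str]:
    """Keys rewritten by one page write: always content, sidebar if the page list changed."""
    keys = [content_key(user, wiki, page_path)]
    if kind is not WriteKind.EDIT:
        keys.append(sidebar_key(user, wiki))
    return keys


if __name__ == "__main__":
    print(fragments_to_put("alice", "notes", "Design/Overview", WriteKind.EDIT))
    print(fragments_to_put("alice", "notes", "Design/Overview", WriteKind.CREATE))
```

The sketch leaves out removing the stale content fragment on delete or rename, and the shell template, which is rewritten on settings changes or deploys rather than on page writes.
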
### Open question for Claude Code

Can Otterwiki's Jinja2 templates be invoked to render these fragments in isolation? Specifically:

1. Can the page content area be rendered to an HTML fragment without a full Flask request context? (Markdown → HTML with Otterwiki's rendering pipeline, including wiki links, syntax highlighting, etc.)
2. Can the sidebar/navigation partial be rendered independently given a page list from the git tree or DynamoDB?
3. What does the shell template need? Is it a static HTML wrapper, or does it depend on per-request context (e.g., user name in header, edit button visibility)?
4. Does the CSS/JS depend on the page content (e.g., conditional asset loading), or is it uniform across all pages?

## Option A: Thin Assembly Lambda (Recommended)

### Architecture

```
Browser → CloudFront → [cache miss] → Assembly Lambda (non-VPC) → S3 fragments
                     → [cache hit]  → cached HTML
```

A lightweight Lambda (no Flask, no Otterwiki, no VPC) serves as the CloudFront origin for browser reads. On cache miss:

1. Parse the request path to determine user, wiki, and page
2. Fetch sidebar fragment + content fragment + shell template from S3 (3 `GetObject` calls, parallelized)
3. String-substitute the fragments into the shell template
4. Return assembled HTML with `Cache-Control: public, max-age=30`

CloudFront caches the assembled response. Subsequent reads within the TTL never touch any Lambda.

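The four steps above might look roughly like the sketch below. Several details are assumptions rather than decisions from this page: the URL layout `/{user}/{wiki}/{page}`, the default page `Home`, the `{{sidebar}}` and `{{content}}` placeholders in the shell, the bucket name, and a Lambda function URL style event carrying `rawPath`. The fetch function is injected so the assembly logic can be exercised without AWS.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple

BUCKET = "example-fragment-bucket"  # hypothetical bucket name
PLACEHOLDER = re.compile(r"\{\{(sidebar|content)\}\}")  # assumed shell markers


def parse_path(path: str) -> Tuple[str, str, str]:
    """Split /{user}/{wiki}/{page...}; the page defaults to 'Home' (assumption)."""
    parts = path.strip("/").split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"cannot parse path: {path!r}")
    page = parts[2] if len(parts) == 3 and parts[2] else "Home"
    return parts[0], parts[1], page


def make_s3_fetch(bucket: str) -> Callable[[str], str]:
    import boto3  # imported lazily so the rest of the sketch runs without AWS

    client = boto3.client("s3")  # one client, shared by the worker threads

    def fetch(key: str) -> str:
        return client.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")

    return fetch


def assemble(path: str, fetch: Callable[[str], str]) -> dict:
    user, wiki, page = parse_path(path)
    base = f"fragments/{user}/{wiki}"
    keys = [f"{base}/shell.html", f"{base}/sidebar.html", f"{base}/pages/{page}.html"]
    with ThreadPoolExecutor(max_workers=3) as pool:
        shell, sidebar, content = pool.map(fetch, keys)  # three reads in parallel
    parts = {"sidebar": sidebar, "content": content}
    # Single pass over the shell, so placeholder-like text inside a fragment
    # is never substituted a second time.
    html = PLACEHOLDER.sub(lambda m: parts[m.group(1)], shell)
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html; charset=utf-8",
            "Cache-Control": "public, max-age=30",
        },
        "body": html,
    }


_fetch = None


def handler(event, context):
    global _fetch
    if _fetch is None:
        _fetch = make_s3_fetch(BUCKET)  # reused across warm invocations
    return assemble(event["rawPath"], _fetch)
```

Error handling is omitted: a missing fragment would need to map to a 404 (ideally with a short or zero TTL) rather than surface as an unhandled exception.
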
### Auth

A CloudFront Function on the viewer-request event validates the JWT (from a cookie or header) before the request reaches the cache or the assembly Lambda. Public wikis skip validation. See [[Tasks/Emergent]] E-2 for auth design details.

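The check itself is small. CloudFront Functions run JavaScript, so the Python below only illustrates the logic and is not deployable as written. It assumes an HS256-signed token with an `exp` claim; the actual algorithm, claims, and key handling belong to the E-2 design and are not specified here. All function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(signature_b64)
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        signed = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, signature):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp <= current:
        return None
    return claims


def allow_request(wiki_is_public: bool, token: Optional[str], secret: bytes) -> bool:
    if wiki_is_public:
        return True  # public wikis skip validation
    return token is not None and verify_hs256(token, secret) is not None
```

Because the check runs on viewer-request, it applies to cache hits as well as misses, which is what lets private pages be cached at the edge.
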
### Performance

- Cold start: sub-100ms. Non-VPC Lambda with only stdlib + boto3 (included in runtime). The E-1 benchmarks show `bare_vpc` at 88ms init with VPC overhead; without VPC this would be faster.
- Warm invocation: dominated by the 3 parallel S3 reads, which typically take tens of milliseconds rather than single digits; the string substitution itself is negligible
- Cache hit: ~10–50ms (edge latency)
- First page load after idle: ~100–300ms (Lambda cold start + S3 reads). Acceptable.

### Advantages

- Complete HTML served on every request: no client-side rendering, no JS dependency
- Works with search engines, curl, accessibility tools, etc.
- Cold start is negligible
- Zero cost at rest (Lambda scales to zero, S3 storage is pennies)
- Fragment cache granularity means minimal S3 writes per wiki edit
- Assembly Lambda is trivial to implement and test (~50 lines of code)

### Disadvantages

- Three S3 reads per cache miss (though parallelized and fast)
- Shell template must be kept in sync with Otterwiki's actual template output
- Any Otterwiki template change (theme update, layout change) requires re-rendering the shell fragment
- Adds a new Lambda function to manage/deploy

### Variant: S3 as direct CloudFront origin

If the shell template is truly static and the sidebar + content can be pre-assembled into a single HTML file on write, the assembly Lambda is unnecessary and CloudFront serves directly from S3. This only works if we accept the full-page re-render cost: every sidebar-changing write must re-render all pages. For a 500-page wiki this is likely too expensive (see "Scaling problem" under Option C).

## Option B: Hybrid Static Content + Async Sidebar

### Architecture

```
Browser → CloudFront → S3 (full page HTML, no sidebar)
        → JS fetches sidebar fragment from CDN
```

The write Lambda renders full page HTML (chrome + content, no sidebar) and stores it directly in S3, which serves as the CloudFront origin. The sidebar loads asynchronously via a small inline `<script>` that fetches the sidebar fragment from the CDN and injects it into the DOM.

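On the write side, this could be sketched as below: the function wraps the rendered content in the chrome and appends the inline loader. The `{{body}}` placeholder, the `sidebar` element id, and the function name are all hypothetical; Otterwiki's real templates would determine where the sidebar container lives.

```python
import json


def render_page_without_sidebar(chrome: str, content: str, sidebar_url: str) -> str:
    """Build full-page HTML with an empty sidebar container and an inline loader."""
    # json.dumps yields a valid JavaScript string literal; escaping "<" keeps a
    # stray "</script>" in the URL from terminating the inline script early.
    url_literal = json.dumps(sidebar_url).replace("<", "\\u003c")
    loader = (
        "<script>"
        f"fetch({url_literal})"
        ".then(function (r) { return r.ok ? r.text() : ''; })"
        ".then(function (html) {"
        " document.getElementById('sidebar').innerHTML = html; });"
        "</script>"
    )
    body = f'<nav id="sidebar"></nav><main>{content}</main>{loader}'
    return chrome.replace("{{body}}", body)


if __name__ == "__main__":
    print(render_page_without_sidebar(
        "<html><body>{{body}}</body></html>",
        "<p>Page content</p>",
        "/fragments/alice/notes/sidebar.html",
    ))
```

If the sidebar fetch fails, the page still renders with an empty container, which matches the progressive-enhancement framing of this option.
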
### Auth

Same CloudFront Functions JWT validation as Option A.

### Performance

- Content visible on first paint (no Lambda, pure CDN → S3)
- Sidebar appears shortly after (second CDN fetch, likely <50ms)
- Visual flash as the sidebar loads, which may be imperceptible on fast connections

### Advantages

- No assembly Lambda at all: the simplest server-side architecture
- Full page content is visible immediately; sidebar is progressive enhancement
- Each page is a single S3 object; sidebar is one shared object per wiki

### Disadvantages

- Requires JavaScript to render the sidebar, so the wiki is partially non-functional without JS
- Brief flash of missing sidebar on initial load
- Otterwiki's CSS layout must tolerate a missing sidebar during load (may need CSS adjustment)
- Two HTTP requests per page load instead of one (content + sidebar), though both are CDN hits

### Open question for Claude Code

Does Otterwiki's CSS layout handle a sidebar that isn't present in the initial HTML? Is the sidebar rendered into the page template server-side, or is there already a container that could be populated client-side?

## Option C: Pre-Rendered Full Pages (Static Site Generator)

### Architecture

```
Browser → CloudFront → S3 (complete pre-rendered pages)
Write → Otterwiki Lambda → render ALL pages → S3
```

On every write, the warm Otterwiki Lambda renders every page in the wiki to complete HTML (chrome + sidebar + content) and uploads them all to S3.

### Auth

Same CloudFront Functions JWT validation.

### Performance

- Reads are pure CDN → S3: the fastest possible path, with no Lambda at all
- Write cost scales with wiki size: rendering N pages on every write

### Scaling problem

This doesn't scale. A sidebar-changing write (page create, delete, rename) requires re-rendering every page because the sidebar is embedded in each one. For a 500-page wiki, that's 500 Jinja2 renders + 500 S3 PUTs on every create/delete/rename. During an active Claude session that creates several pages in sequence, you'd queue up multiple full-site rebuilds. At SaaS scale with many active wikis, this multiplies across tenants.

Content-only edits (the majority) would only re-render one page, but the worst case drives the architecture.

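To make the gap concrete, the sketch below counts S3 PUTs for a hypothetical session under the fragment model and under full pre-rendering. The session mix (5 creates, 20 edits) is an invented example, and the page count is held at 500 for simplicity even though creates would grow it.

```python
from typing import Iterable, Tuple

SIDEBAR_CHANGING = {"create", "delete", "rename"}


def puts_fragment_model(kind: str) -> int:
    """Content fragment always; sidebar fragment only if the page list changed."""
    return 2 if kind in SIDEBAR_CHANGING else 1


def puts_full_prerender(kind: str, page_count: int) -> int:
    """Every page embeds the sidebar, so a sidebar change rewrites all pages."""
    return page_count if kind in SIDEBAR_CHANGING else 1


def session_puts(kinds: Iterable[str], page_count: int) -> Tuple[int, int]:
    kinds = list(kinds)
    fragment = sum(puts_fragment_model(k) for k in kinds)
    prerender = sum(puts_full_prerender(k, page_count) for k in kinds)
    return fragment, prerender


if __name__ == "__main__":
    session = ["create"] * 5 + ["edit"] * 20
    print(session_puts(session, page_count=500))  # (30, 2520)
```

Each pre-render PUT is also paired with a Jinja2 render, so the compute cost grows in the same proportion.
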
### Advantages

- Simplest read path: pure static files, no Lambda, no assembly
- Complete HTML, no JS dependency

### Disadvantages

- Write cost is proportional to wiki size, so it doesn't scale
- Rebuild queueing during active sessions
- Burns Lambda compute on re-rendering pages that didn't change

## Option D: Client-Side SPA Assembly

### Architecture

```
Browser → CloudFront → static HTML shell (SPA)
        → JS fetches content + sidebar as JSON/HTML fragments from CDN
```

The wiki becomes a single-page app. A static shell loads once; JavaScript fetches page content and sidebar as fragments from the CDN and renders them client-side.

### Advantages

- Maximum cache granularity: fragments are cached independently
- Shell template cached indefinitely (content-hashed)
- Navigation between pages doesn't require full page reload

### Disadvantages

- Requires JavaScript for all functionality
- Not accessible without JS; poor for SEO (if wikis are public)
- Significant departure from Otterwiki's server-rendered model
- Would require building a new frontend rather than leveraging Otterwiki's templates

## Comparison

| Option | Cold start (first load after idle) | Cache hit | JS required | Write cost | Complexity | Scales |
|---|---|---|---|---|---|---|
| **A: Assembly Lambda** | ~100–300ms | ~10–50ms | No | 1–2 S3 PUTs | Medium | Yes |
| **B: Hybrid + async sidebar** | None (pure CDN) | ~10–50ms | Sidebar only | 1–2 S3 PUTs | Low | Yes |
| **C: Full pre-render** | None (pure CDN) | ~10–50ms | No | N S3 PUTs (N=pages) | Low | **No** |
| **D: SPA** | None (pure CDN) | ~10–50ms | Yes (all) | 1–2 S3 PUTs | High | Yes |

## Recommendation

**Option A (Thin Assembly Lambda)** provides the best balance: complete HTML with no JS dependency, negligible cold starts, per-fragment caching, and it scales. The assembly Lambda is trivially simple and adds minimal operational overhead.

**Option B (Hybrid)** is the simplest fallback if the sidebar can be loaded asynchronously without UI disruption.

Both options depend on whether Otterwiki's templates can produce fragments in isolation. This is the key question for Claude Code evaluation.

## Cost Impact

All options preserve the zero-cost-at-rest model:

- S3 fragment storage: pennies per wiki (a few KB per page)
- CloudFront: free tier covers light traffic
- Assembly Lambda (Option A): scales to zero, invoked only on cache misses
- No provisioned concurrency needed anywhere in the read path