Commit 883754
2026-03-14 16:30:37 Claude (MCP): [mcp] Design/CDN_Read_Path: options for decoupling read path from Otterwiki Lambda| /dev/null .. Design/CDN_Read_Path.md | |
| @@ 0,0 1,220 @@ | |
| + | --- |
| + | category: reference |
| + | tags: [design, performance, cdn, caching, architecture] |
| + | last_updated: 2026-03-14 |
| + | confidence: medium |
| + | --- |
| + | |
| + | # CDN Read Path Architecture |
| + | |
| + | **Status:** Proposal — evaluating options |
| + | **Relates to:** [[Tasks/Emergent]] (E-1, E-2), [[Dev/E-1_Cold_Start_Benchmarks]], [[Design/Platform_Overview]], [[Design/Operations]] |
| + | |
| + | ## Problem |
| + | |
| + | Otterwiki is a traditional Flask/WSGI application. Importing `otterwiki.server` takes ~3.5s due to Flask app factory initialization, SQLAlchemy model creation, Jinja2 template loading, pluggy hook execution, and transitive imports (numpy, faiss, lxml, Pillow). See [[Dev/E-1_Cold_Start_Benchmarks]] for full breakdown. |
| + | |
| + | This makes the Otterwiki Lambda unsuitable for serving browser reads. A cold start produces a 4.5–5.7s page load, well past the 2.5s "good" LCP threshold. VPC/EFS overhead is negligible (~80ms init); the bottleneck is Python module initialization and cannot be meaningfully reduced by memory scaling or lazy imports alone. |
| + | |
| + | Meanwhile, wiki pages are written infrequently (during Claude MCP sessions) and read far more often (browsing, reference, sharing). The read path should not depend on the heavy Lambda. |
| + | |
| + | ## Constraints |
| + | |
| + | - Zero (or near-zero) cost at rest |
| + | - No new CDN provider (stay on CloudFront) |
| + | - Private wikis must enforce auth before serving content |
| + | - MCP and API write paths continue to use the existing VPC Lambda |
| + | - Solution must work with Otterwiki's existing templates and UI (sidebar navigation, page chrome, CSS/JS) |
| + | |
| + | ## Fragment Model (common to all options below) |
| + | |
| + | On every write (via MCP or API), the warm Otterwiki Lambda renders and stores HTML fragments in S3: |
| + | |
| + | - **Content fragment** (`/fragments/{user}/{wiki}/pages/{Page_Path}.html`) — the rendered markdown for that page, inside Otterwiki's content `<div>`. Updated only when that specific page is edited. |
| + | - **Sidebar fragment** (`/fragments/{user}/{wiki}/sidebar.html`) — the wiki navigation tree. Updated on page create, delete, or rename. NOT updated on content-only edits. |
| + | - **Shell template** (`/fragments/{user}/{wiki}/shell.html`) — the page chrome (header, footer, CSS/JS links, layout). Updated on Otterwiki settings changes (theme, sidebar preferences) or on deploy. Changes rarely. |
| + | |
| + | Each write produces 1–2 S3 PUTs: always the content fragment, plus the sidebar fragment if the page list changed. This scales to any wiki size — a 500-page wiki still only touches 1–2 objects per write. |
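The write-side rule above (always the content fragment, plus the sidebar when the page list changes) can be sketched as a small key-selection helper. This is a minimal Python sketch, not Otterwiki code: `keys_to_put` and its `page_list_changed` flag are hypothetical names, and the keys mirror the fragment paths listed above without the leading slash, on the assumption that S3 object keys are stored that way.

```python
def fragment_prefix(user: str, wiki: str) -> str:
    """Common key prefix for all fragments of one wiki (assumed layout)."""
    return f"fragments/{user}/{wiki}"


def keys_to_put(user: str, wiki: str, page_path: str,
                page_list_changed: bool) -> list[str]:
    """Return the S3 keys a single write should re-render and PUT.

    page_list_changed is True for create, delete, or rename and False for
    a content-only edit (hypothetical flag supplied by the write path).
    """
    prefix = fragment_prefix(user, wiki)
    # The content fragment for the edited page is always rewritten.
    keys = [f"{prefix}/pages/{page_path}.html"]
    # The sidebar is rewritten only when the navigation tree changed.
    if page_list_changed:
        keys.append(f"{prefix}/sidebar.html")
    return keys
```

The result has one or two entries regardless of wiki size. A rename or delete would presumably also need to remove the old content fragment; that cleanup is not covered here.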
| + | |
| + | ### Open question for Claude Code |
| + | |
| + | Can Otterwiki's Jinja2 templates be invoked to render these fragments in isolation? Specifically: |
| + | |
| + | 1. Can the page content area be rendered to an HTML fragment without a full Flask request context? (Markdown → HTML with Otterwiki's rendering pipeline, including wiki links, syntax highlighting, etc.) |
| + | 2. Can the sidebar/navigation partial be rendered independently given a page list from the git tree or DynamoDB? |
| + | 3. What does the shell template need? Is it a static HTML wrapper, or does it depend on per-request context (e.g., user name in header, edit button visibility)? |
| + | 4. Does the CSS/JS depend on the page content (e.g., conditional asset loading), or is it uniform across all pages? |
| + | |
| + | ## Option A: Thin Assembly Lambda (Recommended) |
| + | |
| + | ### Architecture |
| + | |
| + | ``` |
| + | Browser → CloudFront → [cache miss] → Assembly Lambda (non-VPC) → S3 fragments |
| + | → [cache hit] → cached HTML |
| + | ``` |
| + | |
| + | A lightweight Lambda (no Flask, no Otterwiki, no VPC) serves as the CloudFront origin for browser reads. On cache miss: |
| + | |
| + | 1. Parse the request path to determine user, wiki, and page |
| + | 2. Fetch sidebar fragment + content fragment + shell template from S3 (3 `GetObject` calls, parallelized) |
| + | 3. String-substitute the fragments into the shell template |
| + | 4. Return assembled HTML with `Cache-Control: public, max-age=30` |
| + | |
| + | CloudFront caches the assembled response. Subsequent reads within the TTL never touch any Lambda. |
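The four cache-miss steps could look roughly like the following Python sketch. The URL layout (`/{user}/{wiki}/{page}`), the slot markers `<!--SIDEBAR-->` and `<!--CONTENT-->`, and the injected `get_object` callable are all assumptions made for illustration. In a real Lambda, `get_object` would wrap boto3's `s3.get_object(Bucket=..., Key=...)["Body"].read()`, and the response shape assumes a Lambda function URL or a similar proxy integration.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical slot markers; the real shell template may mark its slots differently.
_SLOT = re.compile(r"<!--(SIDEBAR|CONTENT)-->")


def parse_path(path: str) -> tuple[str, str, str]:
    """Step 1: split '/{user}/{wiki}/{page...}' (assumed URL layout)."""
    parts = path.strip("/").split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unrecognized path: {path!r}")
    return parts[0], parts[1], parts[2]


def fragment_keys(user: str, wiki: str, page: str) -> list[str]:
    """Keys for shell, sidebar, and content, in that order."""
    prefix = f"fragments/{user}/{wiki}"
    return [f"{prefix}/shell.html",
            f"{prefix}/sidebar.html",
            f"{prefix}/pages/{page}.html"]


def assemble(shell: str, sidebar: str, content: str) -> str:
    """Step 3: fill both slots in one pass, so markers that happen to
    appear inside a fragment are not substituted again."""
    values = {"SIDEBAR": sidebar, "CONTENT": content}
    return _SLOT.sub(lambda m: values[m.group(1)], shell)


def handle(path: str, get_object: Callable[[str], str]) -> dict:
    """Assemble one page; get_object(key) returns the fragment text."""
    user, wiki, page = parse_path(path)
    keys = fragment_keys(user, wiki, page)
    # Step 2: fetch the three fragments in parallel.
    with ThreadPoolExecutor(max_workers=3) as pool:
        shell, sidebar, content = pool.map(get_object, keys)
    # Step 4: return the page with the short TTL from the text above.
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html; charset=utf-8",
            "Cache-Control": "public, max-age=30",
        },
        "body": assemble(shell, sidebar, content),
    }
```

Passing the fetcher in keeps the assembly logic testable without AWS. Handling of missing fragments (for example a 404 for an unknown page) is deliberately left out.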
| + | |
| + | ### Auth |
| + | |
| + | A CloudFront Function on viewer-request validates the JWT (cookie or header) before the request reaches the cache or the assembly Lambda. Public wikis skip validation. See [[Tasks/Emergent]] E-2 for auth design details.
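To make the edge check concrete, here is the validation logic written in Python. CloudFront Functions run JavaScript, so this would need porting; it also assumes HS256-signed tokens with an `exp` claim and a hypothetical `wiki` claim naming the wiki the token grants access to. The actual token format belongs to the E-2 auth design and is not defined here.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_decode(segment: str) -> bytes:
    """Decode unpadded base64url, as used in JWT segments."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_token(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token (helper for the example; issuance is out of scope)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"


def is_allowed(token: str, secret: bytes, wiki: str,
               now: Optional[float] = None) -> bool:
    """True only if the signature is valid, the token is unexpired, and
    its (hypothetical) 'wiki' claim matches the requested wiki."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
            return False
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                            hashlib.sha256).digest()
        # Constant-time comparison of the signature bytes.
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return False
        claims = json.loads(b64url_decode(payload_b64))
        return claims.get("wiki") == wiki and claims.get("exp", 0) > now
    except (ValueError, TypeError, AttributeError):
        # Malformed token: wrong segment count, bad base64, bad JSON, etc.
        return False
```

Any failure returns `False`, so a malformed token is treated the same as a missing one.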
| + | |
| + | ### Performance |
| + | |
| + | - Cold start: expected sub-100ms init. Non-VPC Lambda with only stdlib + boto3 (included in runtime). The E-1 benchmarks show `bare_vpc` at 88ms init with VPC overhead; without VPC this should be at least as fast.
| + | - Warm invocation: roughly the latency of one S3 read, since the 3 reads run in parallel (single-digit ms at best, more likely low tens of ms); string substitution is negligible
| + | - Cache hit: ~10–50ms (edge latency) |
| + | - First page load after idle: ~100–300ms (Lambda cold start + S3 reads). Acceptable. |
| + | |
| + | ### Advantages |
| + | |
| + | - Complete HTML served on every request — no client-side rendering, no JS dependency |
| + | - Works with search engines, curl, accessibility tools, etc. |
| + | - Cold start is negligible |
| + | - Zero cost at rest (Lambda scales to zero, S3 storage is pennies) |
| + | - Fragment cache granularity means minimal S3 writes per wiki edit |
| + | - Assembly Lambda is trivial to implement and test (~50 lines of code) |
| + | |
| + | ### Disadvantages |
| + | |
| + | - Three S3 reads per cache miss (though parallelized and fast) |
| + | - Shell template must be kept in sync with Otterwiki's actual template output |
| + | - Any Otterwiki template change (theme update, layout change) requires re-rendering the shell template
| + | - Adds a new Lambda function to manage/deploy |
| + | |
| + | ### Variant: S3 as direct CloudFront origin |
| + | |
| + | If the shell template is truly static and the sidebar + content can be pre-assembled into a single HTML file on write, the assembly Lambda is unnecessary: CloudFront serves directly from S3. This only works if we accept the full-page re-render cost, since every sidebar-changing write must re-render all pages. For a 500-page wiki this is likely too expensive (see "Scaling problem" under Option C below).
| + | |
| + | ## Option B: Hybrid Static Content + Async Sidebar |
| + | |
| + | ### Architecture |
| + | |
| + | ``` |
| + | Browser → CloudFront → S3 (full page HTML, no sidebar) |
| + | → JS fetches sidebar fragment from CDN |
| + | ``` |
| + | |
| + | The write Lambda renders full page HTML (chrome + content, no sidebar) and stores it in S3, which acts as the CloudFront origin. The sidebar loads asynchronously via a small inline `<script>` that fetches the sidebar fragment from the CDN and injects it into the DOM.
| + | |
| + | ### Auth |
| + | |
| + | Same CloudFront Functions JWT validation as Option A. |
| + | |
| + | ### Performance |
| + | |
| + | - Content visible on first paint (no Lambda, pure CDN → S3) |
| + | - Sidebar appears shortly after (second CDN fetch, likely <50ms) |
| + | - Visual flash as sidebar loads — may be imperceptible on fast connections |
| + | |
| + | ### Advantages |
| + | |
| + | - No assembly Lambda at all — simplest server-side architecture |
| + | - Full page content is visible immediately; sidebar is progressive enhancement |
| + | - Each page is a single S3 object; sidebar is one shared object per wiki |
| + | |
| + | ### Disadvantages |
| + | |
| + | - Requires JavaScript to render the sidebar — wiki is partially non-functional without JS |
| + | - Brief flash of missing sidebar on initial load |
| + | - Otterwiki's CSS layout must tolerate a missing sidebar during load (may need CSS adjustment) |
| + | - Two HTTP requests per page load instead of one (content + sidebar), though both are CDN hits |
| + | |
| + | ### Open question for Claude Code |
| + | |
| + | Does Otterwiki's CSS layout handle a sidebar that isn't present in the initial HTML? Is the sidebar rendered into the page template server-side, or is there already a container that could be populated client-side? |
| + | |
| + | ## Option C: Pre-Rendered Full Pages (Static Site Generator) |
| + | |
| + | ### Architecture |
| + | |
| + | ``` |
| + | Browser → CloudFront → S3 (complete pre-rendered pages) |
| + | Write → Otterwiki Lambda → render ALL pages → S3 |
| + | ``` |
| + | |
| + | On every write, the warm Otterwiki Lambda renders every page in the wiki to complete HTML (chrome + sidebar + content) and uploads them all to S3. |
| + | |
| + | ### Auth |
| + | |
| + | Same CloudFront Functions JWT validation as Option A.
| + | |
| + | ### Performance |
| + | |
| + | - Reads are pure CDN → S3 — fastest possible, no Lambda at all |
| + | - Write cost scales with wiki size: rendering N pages on every write |
| + | |
| + | ### Scaling problem |
| + | |
| + | This doesn't scale. A sidebar-changing write (page create, delete, rename) requires re-rendering every page because the sidebar is embedded in each one. For a 500-page wiki, that's 500 Jinja2 renders + 500 S3 PUTs on every create/delete/rename. During an active Claude session that creates several pages in sequence, you'd queue up multiple full-site rebuilds. At SaaS scale with many active wikis, this multiplies across tenants. |
| + | |
| + | Content-only edits (the majority) would only re-render one page — but the worst case drives the architecture. |
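As a worked example of the worst case, the sketch below counts S3 PUTs for a session that creates several pages in sequence. It assumes no rebuild coalescing and that each create re-renders every page that exists after it; the function names are illustrative only.

```python
def full_prerender_puts(initial_pages: int, pages_created: int) -> int:
    """Option C: the i-th create re-renders all (initial_pages + i) pages."""
    return sum(initial_pages + i for i in range(1, pages_created + 1))


def fragment_puts(pages_created: int) -> int:
    """Fragment model: each create writes one content fragment plus the sidebar."""
    return 2 * pages_created


# 500-page wiki, 5 new pages in one session:
# Option C: 501 + 502 + 503 + 504 + 505 = 2515 PUTs (and as many Jinja2 renders)
# Fragment model: 5 * 2 = 10 PUTs
print(full_prerender_puts(500, 5), fragment_puts(5))  # prints "2515 10"
```

Under these assumptions the gap grows linearly with wiki size for Option C and stays constant for the fragment model.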
| + | |
| + | ### Advantages |
| + | |
| + | - Simplest read path — pure static files, no Lambda, no assembly |
| + | - Complete HTML, no JS dependency |
| + | |
| + | ### Disadvantages |
| + | |
| + | - Write cost proportional to wiki size — doesn't scale |
| + | - Rebuild queueing during active sessions |
| + | - Burns Lambda compute on re-rendering pages that didn't change |
| + | |
| + | ## Option D: Client-Side SPA Assembly |
| + | |
| + | ### Architecture |
| + | |
| + | ``` |
| + | Browser → CloudFront → static HTML shell (SPA) |
| + | → JS fetches content + sidebar as JSON/HTML fragments from CDN |
| + | ``` |
| + | |
| + | The wiki becomes a single-page app. A static shell loads once; JavaScript fetches page content and sidebar as fragments from the CDN and renders them client-side. |
| + | |
| + | ### Advantages |
| + | |
| + | - Maximum cache granularity — fragments cached independently |
| + | - Shell template cached indefinitely (content-hashed) |
| + | - Navigation between pages doesn't require full page reload |
| + | |
| + | ### Disadvantages |
| + | |
| + | - Requires JavaScript for all functionality |
| + | - Not accessible without JS; poor for SEO (if wikis are public) |
| + | - Significant departure from Otterwiki's server-rendered model |
| + | - Would require building a new frontend rather than leveraging Otterwiki's templates |
| + | |
| + | ## Comparison |
| + | |
| + | | Option | Cold start | Cache hit | JS required | Write cost | Complexity | Scales |
| + | |---|---|---|---|---|---|---| |
| + | | **A: Assembly Lambda** | ~100–300ms | ~10–50ms | No | 1–2 S3 PUTs | Medium | Yes | |
| + | | **B: Hybrid + async sidebar** | None (pure CDN) | ~10–50ms | Sidebar only | 1–2 S3 PUTs | Low | Yes | |
| + | | **C: Full pre-render** | None (pure CDN) | ~10–50ms | No | 1 S3 PUT (content-only edit) up to N (sidebar change; N=pages) | Low | **No** |
| + | | **D: SPA** | None (pure CDN) | ~10–50ms | Yes (all) | 1–2 S3 PUTs | High | Yes | |
| + | |
| + | ## Recommendation |
| + | |
| + | **Option A (Thin Assembly Lambda)** provides the best balance: complete HTML with no JS dependency, negligible cold starts, per-fragment caching, and it scales. The assembly Lambda is trivially simple and adds minimal operational overhead. |
| + | |
| + | **Option B (Hybrid)** is the simplest fallback if the sidebar can be loaded asynchronously without UI disruption. |
| + | |
| + | Both options depend on whether Otterwiki's templates can produce fragments in isolation. This is the key question for Claude Code evaluation. |
| + | |
| + | ## Cost Impact |
| + | |
| + | All options preserve the zero-cost-at-rest model: |
| + | |
| + | - S3 fragment storage: pennies per wiki (a few KB per page) |
| + | - CloudFront: free tier covers light traffic |
| + | - Assembly Lambda (Option A): scales to zero, invoked only on cache misses |
| + | - No provisioned concurrency needed anywhere in the read path |