Commit 883754

2026-03-14 16:30:37 Claude (MCP): [mcp] Design/CDN_Read_Path: options for decoupling read path from Otterwiki Lambda
/dev/null .. Design/CDN_Read_Path.md
@@ 0,0 1,220 @@
+ ---
+ category: reference
+ tags: [design, performance, cdn, caching, architecture]
+ last_updated: 2026-03-14
+ confidence: medium
+ ---
+
+ # CDN Read Path Architecture
+
+ **Status:** Proposal — evaluating options
+ **Relates to:** [[Tasks/Emergent]] (E-1, E-2), [[Dev/E-1_Cold_Start_Benchmarks]], [[Design/Platform_Overview]], [[Design/Operations]]
+
+ ## Problem
+
+ Otterwiki is a traditional Flask/WSGI application. Importing `otterwiki.server` takes ~3.5s due to Flask app factory initialization, SQLAlchemy model creation, Jinja2 template loading, pluggy hook execution, and transitive imports (numpy, faiss, lxml, Pillow). See [[Dev/E-1_Cold_Start_Benchmarks]] for full breakdown.
+
+ This makes the Otterwiki Lambda unsuitable for serving browser reads. A cold start produces a 4.5–5.7s page load, well past the 2.5s "good" LCP threshold. VPC/EFS overhead is negligible (~80ms init); the bottleneck is Python module initialization and cannot be meaningfully reduced by memory scaling or lazy imports alone.
+
+ Meanwhile, wiki pages are written infrequently (during Claude MCP sessions) and read far more often (browsing, reference, sharing). The read path should not depend on the heavy Lambda.
+
+ ## Constraints
+
+ - Zero (or near-zero) cost at rest
+ - No new CDN provider (stay on CloudFront)
+ - Private wikis must enforce auth before serving content
+ - MCP and API write paths continue to use the existing VPC Lambda
+ - Solution must work with Otterwiki's existing templates and UI (sidebar navigation, page chrome, CSS/JS)
+
+ ## Fragment Model (common to all options below)
+
+ On every write (via MCP or API), the warm Otterwiki Lambda renders and stores HTML fragments in S3:
+
+ - **Content fragment** (`/fragments/{user}/{wiki}/pages/{Page_Path}.html`) — the rendered markdown for that page, inside Otterwiki's content `<div>`. Updated only when that specific page is edited.
+ - **Sidebar fragment** (`/fragments/{user}/{wiki}/sidebar.html`) — the wiki navigation tree. Updated on page create, delete, or rename. NOT updated on content-only edits.
+ - **Shell template** (`/fragments/{user}/{wiki}/shell.html`) — the page chrome (header, footer, CSS/JS links, layout). Updated on Otterwiki settings changes (theme, sidebar preferences) or on deploy. Changes rarely.
+
+ Each write produces 1–2 S3 PUTs: always the content fragment, plus the sidebar fragment if the page list changed. Write cost is therefore independent of wiki size: a 500-page wiki still touches only 1–2 objects per write.
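
The write rule above can be sketched as a small key planner. This is an illustration only: the key layout is copied from the fragment list, while the function names and the `page_list_changed` flag are hypothetical and not part of Otterwiki.

```python
def content_key(user: str, wiki: str, page_path: str) -> str:
    """Key of the per-page content fragment (layout taken from the list above)."""
    return f"/fragments/{user}/{wiki}/pages/{page_path}.html"


def sidebar_key(user: str, wiki: str) -> str:
    """Key of the per-wiki sidebar fragment."""
    return f"/fragments/{user}/{wiki}/sidebar.html"


def keys_to_put(user: str, wiki: str, page_path: str, page_list_changed: bool) -> list:
    """Return the S3 keys one write must PUT: always 1, at most 2.

    `page_list_changed` is a hypothetical flag the write path would set
    on create, delete, or rename.
    """
    keys = [content_key(user, wiki, page_path)]
    if page_list_changed:
        keys.append(sidebar_key(user, wiki))
    return keys


# A content-only edit touches one object; a page creation touches two.
print(keys_to_put("alice", "notes", "Design/CDN_Read_Path", page_list_changed=False))
print(keys_to_put("alice", "notes", "Design/New_Page", page_list_changed=True))
```

The shell template is left out on purpose: per the list above it changes on settings changes or deploys, not on page writes.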
+
+ ### Open question for Claude Code
+
+ Can Otterwiki's Jinja2 templates be invoked to render these fragments in isolation? Specifically:
+
+ 1. Can the page content area be rendered to an HTML fragment without a full Flask request context? (Markdown → HTML with Otterwiki's rendering pipeline, including wiki links, syntax highlighting, etc.)
+ 2. Can the sidebar/navigation partial be rendered independently given a page list from the git tree or DynamoDB?
+ 3. What does the shell template need? Is it a static HTML wrapper, or does it depend on per-request context (e.g., user name in header, edit button visibility)?
+ 4. Does the CSS/JS depend on the page content (e.g., conditional asset loading), or is it uniform across all pages?
+
+ ## Option A: Thin Assembly Lambda (Recommended)
+
+ ### Architecture
+
+ ```
+ Browser → CloudFront → [cache miss] → Assembly Lambda (non-VPC) → S3 fragments
+ → [cache hit] → cached HTML
+ ```
+
+ A lightweight Lambda (no Flask, no Otterwiki, no VPC) serves as the CloudFront origin for browser reads. On cache miss:
+
+ 1. Parse the request path to determine user, wiki, and page
+ 2. Fetch sidebar fragment + content fragment + shell template from S3 (3 `GetObject` calls, parallelized)
+ 3. String-substitute the fragments into the shell template
+ 4. Return assembled HTML with `Cache-Control: public, max-age=30`
+
+ CloudFront caches the assembled response. Subsequent reads within the TTL never touch any Lambda.
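
A minimal sketch of the four cache-miss steps, under several assumptions: the URL layout `/{user}/{wiki}/{Page_Path}`, the slot markers `<!--SIDEBAR-->` and `<!--CONTENT-->`, and the response shape (the `statusCode`/`headers`/`body` form used by Lambda function URLs) are all hypothetical choices, since none of them is fixed by this proposal. The S3 read is injected as a `fetch` callable so the example runs without AWS access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical slot markers; the real shell template may use different ones.
SIDEBAR_SLOT = "<!--SIDEBAR-->"
CONTENT_SLOT = "<!--CONTENT-->"


def parse_path(path: str) -> tuple:
    """Step 1: split an assumed '/{user}/{wiki}/{Page_Path}' URL path."""
    parts = path.strip("/").split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unroutable path: {path!r}")
    return parts[0], parts[1], parts[2]


def assemble(path: str, fetch: Callable[[str], str]) -> dict:
    """Build the response for a cache miss.

    `fetch` maps an S3 key to the object's text. In a deployed Lambda it would
    wrap boto3's `get_object`; here it is injected so the sketch runs offline.
    """
    user, wiki, page = parse_path(path)
    base = f"/fragments/{user}/{wiki}"
    keys = [f"{base}/shell.html", f"{base}/sidebar.html", f"{base}/pages/{page}.html"]

    # Step 2: issue the three reads in parallel; map() preserves key order.
    with ThreadPoolExecutor(max_workers=3) as pool:
        shell, sidebar, content = pool.map(fetch, keys)

    # Step 3: content goes in last, so slot-like text inside a page body
    # is never expanded by a later replace.
    html = shell.replace(SIDEBAR_SLOT, sidebar).replace(CONTENT_SLOT, content)

    # Step 4: return the assembled page with the short TTL from the list above.
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html; charset=utf-8",
            "Cache-Control": "public, max-age=30",
        },
        "body": html,
    }


# Offline demo with an in-memory stand-in for the S3 bucket.
store = {
    "/fragments/alice/notes/shell.html": "<nav><!--SIDEBAR--></nav><main><!--CONTENT--></main>",
    "/fragments/alice/notes/sidebar.html": "<ul><li>Home</li></ul>",
    "/fragments/alice/notes/pages/Home.html": "<h1>Home</h1>",
}
response = assemble("/alice/notes/Home", store.__getitem__)
print(response["body"])
```

Injecting `fetch` also keeps the assembly logic testable without mocking S3, which fits the claim that this function should stay trivial.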
+
+ ### Auth
+
+ A CloudFront Function on the viewer-request event validates the JWT (from a cookie or header) before the request reaches the cache or the assembly Lambda. Public wikis skip validation. See [[Tasks/Emergent]] E-2 for auth design details.
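
The validation logic might look roughly like the sketch below. Everything specific here is an assumption: the HS256 algorithm, the `wiki` and `exp` claim names, and the helper names are hypothetical, and the actual design lives in E-2. A deployed CloudFront Function would be written in JavaScript; Python is used here only to show the checks in a runnable form.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token (demo helper standing in for the real issuer)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url_encode(sig)}"


def is_authorized(token: str, secret: bytes, wiki: str, now: Optional[float] = None) -> bool:
    """True only if the signature verifies, the token is unexpired,
    and its (hypothetical) `wiki` claim matches the requested wiki."""
    now = time.time() if now is None else now
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return False
        expected = hmac.new(secret, f"{header_b64}.{body_b64}".encode(), hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return False
        claims = json.loads(_b64url_decode(body_b64))
        return claims["exp"] > now and claims["wiki"] == wiki
    except (ValueError, KeyError, TypeError, AttributeError):
        # Malformed tokens are rejected rather than raising.
        return False


token = sign({"wiki": "alice/notes", "exp": 2000}, b"demo-secret")
print(is_authorized(token, b"demo-secret", "alice/notes", now=1000))   # True
print(is_authorized(token, b"demo-secret", "alice/notes", now=3000))   # False (expired)
print(is_authorized(token, b"wrong-secret", "alice/notes", now=1000))  # False (bad signature)
```

Rejecting at viewer-request means an unauthorized request never reads from the cache, which is why a private wiki's assembled pages can still be cached at the edge.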
+
+ ### Performance
+
+ - Cold start: sub-100ms. Non-VPC Lambda with only stdlib + boto3 (included in runtime). The E-1 benchmarks show `bare_vpc` at 88ms init with VPC overhead; without VPC this would be faster.
+ - Warm invocation: expected to be on the order of single-digit ms, dominated by the 3 parallel S3 reads; the string substitution itself is negligible
+ - Cache hit: ~10–50ms (edge latency)
+ - First page load after idle: ~100–300ms (Lambda cold start + S3 reads). Acceptable.
+
+ ### Advantages
+
+ - Complete HTML served on every request — no client-side rendering, no JS dependency
+ - Works with search engines, curl, accessibility tools, etc.
+ - Cold start is negligible
+ - Zero cost at rest (Lambda scales to zero, S3 storage is pennies)
+ - Fragment cache granularity means minimal S3 writes per wiki edit
+ - Assembly Lambda is trivial to implement and test (~50 lines of code)
+
+ ### Disadvantages
+
+ - Three S3 reads per cache miss (though parallelized and fast)
+ - Shell template must be kept in sync with Otterwiki's actual template output
+ - Any Otterwiki template change (theme update, layout change) requires re-rendering the shell fragment
+ - Adds a new Lambda function to manage/deploy
+
+ ### Variant: S3 as direct CloudFront origin
+
+ If the shell template is truly static and the sidebar + content can be pre-assembled into a single HTML file on write, the assembly Lambda is unnecessary: CloudFront serves directly from S3. This only works if we accept the full-page re-render cost, because every sidebar-changing write must re-render all pages. For a 500-page wiki this is likely too expensive (see "Scaling problem" under Option C).
+
+ ## Option B: Hybrid Static Content + Async Sidebar
+
+ ### Architecture
+
+ ```
+ Browser → CloudFront → S3 (full page HTML, no sidebar)
+ → JS fetches sidebar fragment from CDN
+ ```
+
+ The write Lambda renders full page HTML (chrome + content, no sidebar) and stores it in S3, which serves as the CloudFront origin. The sidebar loads asynchronously via a small inline `<script>` that fetches the sidebar fragment from the CDN and injects it into the DOM.
+
+ ### Auth
+
+ Same CloudFront Functions JWT validation as Option A.
+
+ ### Performance
+
+ - Content visible on first paint (no Lambda, pure CDN → S3)
+ - Sidebar appears shortly after (second CDN fetch, likely <50ms)
+ - Visual flash as sidebar loads — may be imperceptible on fast connections
+
+ ### Advantages
+
+ - No assembly Lambda at all — simplest server-side architecture
+ - Full page content is visible immediately; sidebar is progressive enhancement
+ - Each page is a single S3 object; sidebar is one shared object per wiki
+
+ ### Disadvantages
+
+ - Requires JavaScript to render the sidebar — wiki is partially non-functional without JS
+ - Brief flash of missing sidebar on initial load
+ - Otterwiki's CSS layout must tolerate a missing sidebar during load (may need CSS adjustment)
+ - Two HTTP requests per page load instead of one (content + sidebar), though both are CDN hits
+
+ ### Open question for Claude Code
+
+ Does Otterwiki's CSS layout handle a sidebar that isn't present in the initial HTML? Is the sidebar rendered into the page template server-side, or is there already a container that could be populated client-side?
+
+ ## Option C: Pre-Rendered Full Pages (Static Site Generator)
+
+ ### Architecture
+
+ ```
+ Browser → CloudFront → S3 (complete pre-rendered pages)
+ Write → Otterwiki Lambda → render ALL pages → S3
+ ```
+
+ On every write, the warm Otterwiki Lambda renders every page in the wiki to complete HTML (chrome + sidebar + content) and uploads them all to S3.
+
+ ### Auth
+
+ Same CloudFront Functions JWT validation.
+
+ ### Performance
+
+ - Reads are pure CDN → S3 — fastest possible, no Lambda at all
+ - Write cost scales with wiki size: rendering N pages on every write
+
+ ### Scaling problem
+
+ This doesn't scale. A sidebar-changing write (page create, delete, rename) requires re-rendering every page because the sidebar is embedded in each one. For a 500-page wiki, that's 500 Jinja2 renders + 500 S3 PUTs on every create/delete/rename. During an active Claude session that creates several pages in sequence, you'd queue up multiple full-site rebuilds. At SaaS scale with many active wikis, this multiplies across tenants.
+
+ Content-only edits (the majority) would only re-render one page — but the worst case drives the architecture.
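
To make the worst case concrete, the arithmetic can be sketched as follows. The function names are illustrative, and the model assumes each page creation triggers its own full rebuild with no batching or coalescing, which a real implementation might avoid.

```python
def fragment_model_puts(sidebar_changed: bool) -> int:
    """Fragment model: content fragment, plus the sidebar fragment if the page list changed."""
    return 2 if sidebar_changed else 1


def full_prerender_puts(num_pages: int, sidebar_changed: bool) -> int:
    """Option C: every page embeds the sidebar, so a sidebar change rewrites all pages."""
    return num_pages if sidebar_changed else 1


def session_cost(start_pages: int, creates: int) -> tuple:
    """Total PUTs (fragment model, full pre-render) for `creates` sequential page creations.

    Assumes each creation triggers its own rebuild, with no batching or coalescing.
    """
    fragment_total = 0
    prerender_total = 0
    for i in range(1, creates + 1):
        fragment_total += fragment_model_puts(sidebar_changed=True)
        # After the i-th creation the wiki holds start_pages + i pages, all re-rendered.
        prerender_total += full_prerender_puts(start_pages + i, sidebar_changed=True)
    return fragment_total, prerender_total


# Five pages created in one session on a 500-page wiki.
print(session_cost(start_pages=500, creates=5))  # (10, 2515)
```

Under these assumptions a five-page session costs 10 PUTs in the fragment model and 2,515 renders and PUTs with full pre-rendering.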
+
+ ### Advantages
+
+ - Simplest read path — pure static files, no Lambda, no assembly
+ - Complete HTML, no JS dependency
+
+ ### Disadvantages
+
+ - Write cost proportional to wiki size — doesn't scale
+ - Rebuild queueing during active sessions
+ - Burns Lambda compute on re-rendering pages that didn't change
+
+ ## Option D: Client-Side SPA Assembly
+
+ ### Architecture
+
+ ```
+ Browser → CloudFront → static HTML shell (SPA)
+ → JS fetches content + sidebar as JSON/HTML fragments from CDN
+ ```
+
+ The wiki becomes a single-page app. A static shell loads once; JavaScript fetches page content and sidebar as fragments from the CDN and renders them client-side.
+
+ ### Advantages
+
+ - Maximum cache granularity — fragments cached independently
+ - Shell template cached indefinitely (content-hashed)
+ - Navigation between pages doesn't require full page reload
+
+ ### Disadvantages
+
+ - Requires JavaScript for all functionality
+ - Not accessible without JS; poor for SEO (if wikis are public)
+ - Significant departure from Otterwiki's server-rendered model
+ - Would require building a new frontend rather than leveraging Otterwiki's templates
+
+ ## Comparison
+
+ | | Cold start | Cache hit | JS required | Write cost | Complexity | Scales |
+ |---|---|---|---|---|---|---|
+ | **A: Assembly Lambda** | ~100–300ms | ~10–50ms | No | 1–2 S3 PUTs | Medium | Yes |
+ | **B: Hybrid + async sidebar** | None (pure CDN) | ~10–50ms | Sidebar only | 1–2 S3 PUTs | Low | Yes |
+ | **C: Full pre-render** | None (pure CDN) | ~10–50ms | No | 1 S3 PUT (content edit); N S3 PUTs on sidebar change (N=pages) | Low | **No** |
+ | **D: SPA** | None (pure CDN) | ~10–50ms | Yes (all) | 1–2 S3 PUTs | High | Yes |
+
+ ## Recommendation
+
+ **Option A (Thin Assembly Lambda)** provides the best balance: complete HTML with no JS dependency, negligible cold starts, per-fragment caching, and it scales. The assembly Lambda is trivially simple and adds minimal operational overhead.
+
+ **Option B (Hybrid)** is the simplest fallback if the sidebar can be loaded asynchronously without UI disruption.
+
+ Both options depend on whether Otterwiki's templates can produce fragments in isolation. This is the key question for Claude Code evaluation.
+
+ ## Cost Impact
+
+ All options preserve the zero-cost-at-rest model:
+
+ - S3 fragment storage: pennies per wiki (a few KB per page)
+ - CloudFront: free tier covers light traffic
+ - Assembly Lambda (Option A): scales to zero, invoked only on cache misses
+ - No provisioned concurrency needed anywhere in the read path