---
category: reference
tags: [design, performance, cdn, caching, architecture]
last_updated: 2026-03-14
confidence: medium
---

# CDN Read Path Architecture

> **Superseded.** This page addresses Lambda cold start latency for browser reads, a problem that doesn't exist on a VPS with persistent Gunicorn processes. See [[Design/VPS_Architecture]] for the current plan. The fragment caching concepts could be revisited if read performance becomes a concern, but the specific AWS/CloudFront/S3 implementation is not applicable.

**Status:** Proposal (evaluating options)

**Relates to:** [[Tasks/Emergent]] (E-1, E-2), [[Dev/E-1_Cold_Start_Benchmarks]], [[Design/Platform_Overview]], [[Design/Operations]]

## Problem

Otterwiki is a traditional Flask/WSGI application. Importing `otterwiki.server` takes ~3.5s due to Flask app factory initialization, SQLAlchemy model creation, Jinja2 template loading, pluggy hook execution, and transitive imports (numpy, faiss, lxml, Pillow). See [[Dev/E-1_Cold_Start_Benchmarks]] for full breakdown.

This makes the Otterwiki Lambda unsuitable for serving browser reads. A cold start produces a 4.5–5.7s page load, well past the 2.5s "good" LCP threshold. VPC/EFS overhead is negligible (~80ms init); the bottleneck is Python module initialization and cannot be meaningfully reduced by memory scaling or lazy imports alone.

Meanwhile, wiki pages are written infrequently (during Claude MCP sessions) and read far more often (browsing, reference, sharing). The read path should not depend on the heavy Lambda.

## Constraints

- Zero (or near-zero) cost at rest
- No new CDN provider (stay on CloudFront)
- Private wikis must enforce auth before serving content
- MCP and API write paths continue to use the existing VPC Lambda
- Solution must work with Otterwiki's existing templates and UI (sidebar navigation, page chrome, CSS/JS)

## Fragment Model (common to all options below)

On every write (via MCP or API), the warm Otterwiki Lambda renders and stores HTML fragments in S3:

- **Content fragment** (`/fragments/{user}/{wiki}/pages/{Page_Path}.html`): the rendered markdown for that page, inside Otterwiki's content `<div>`. Updated only when that specific page is edited.
- **Sidebar fragment** (`/fragments/{user}/{wiki}/sidebar.html`): the wiki navigation tree. Updated on page create, delete, or rename. NOT updated on content-only edits.
- **Shell template** (`/fragments/{user}/{wiki}/shell.html`): the page chrome (header, footer, CSS/JS links, layout). Updated on Otterwiki settings changes (theme, sidebar preferences) or on deploy. Changes rarely.

Each write produces 1–2 S3 PUTs: always the content fragment, plus the sidebar fragment if the page list changed. This scales to any wiki size: a 500-page wiki still only touches 1–2 objects per write.

### Open question for Claude Code

Can Otterwiki's Jinja2 templates be invoked to render these fragments in isolation? Specifically:

1. Can the page content area be rendered to an HTML fragment without a full Flask request context? (Markdown → HTML with Otterwiki's rendering pipeline, including wiki links, syntax highlighting, etc.)
2. Can the sidebar/navigation partial be rendered independently given a page list from the git tree or DynamoDB?
3. What does the shell template need? Is it a static HTML wrapper, or does it depend on per-request context (e.g., user name in header, edit button visibility)?
4. Does the CSS/JS depend on the page content (e.g., conditional asset loading), or is it uniform across all pages?

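
The fragment layout and the "1–2 PUTs per write" rule from the Fragment Model section can be sketched roughly as follows. This is only an illustration: `WriteKind`, `content_key`, `sidebar_key`, and `fragments_to_write` are hypothetical names, and the sketch assumes the leading `/` of the documented paths is dropped when forming S3 object keys. Removing the old content object on delete or rename is ignored here.

```python
from enum import Enum


class WriteKind(Enum):
    """Hypothetical classification of a wiki write."""
    EDIT = "edit"      # content-only change to an existing page
    CREATE = "create"  # changes the page list
    DELETE = "delete"  # changes the page list
    RENAME = "rename"  # changes the page list


def content_key(user: str, wiki: str, page_path: str) -> str:
    # Assumed S3 key form of /fragments/{user}/{wiki}/pages/{Page_Path}.html
    return f"fragments/{user}/{wiki}/pages/{page_path}.html"


def sidebar_key(user: str, wiki: str) -> str:
    # Assumed S3 key form of /fragments/{user}/{wiki}/sidebar.html
    return f"fragments/{user}/{wiki}/sidebar.html"


def fragments_to_write(user: str, wiki: str, page_path: str, kind: WriteKind) -> list[str]:
    """Return the fragment keys a single write would PUT.

    The content fragment is always written; the sidebar fragment is
    written only when the page list changed.
    """
    keys = [content_key(user, wiki, page_path)]
    if kind is not WriteKind.EDIT:
        keys.append(sidebar_key(user, wiki))
    return keys
```

Under these assumptions, a content-only edit touches one key and a page create touches two, independent of how many pages the wiki holds.
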
## Option A: Thin Assembly Lambda (Recommended)

### Architecture

```
Browser → CloudFront → [cache miss] → Assembly Lambda (non-VPC) → S3 fragments
                     → [cache hit]  → cached HTML
```

A lightweight Lambda (no Flask, no Otterwiki, no VPC) serves as the CloudFront origin for browser reads. On cache miss:

1. Parse the request path to determine user, wiki, and page
2. Fetch sidebar fragment + content fragment + shell template from S3 (3 `GetObject` calls, parallelized)
3. String-substitute the fragments into the shell template
4. Return assembled HTML with `Cache-Control: public, max-age=30`

CloudFront caches the assembled response. Subsequent reads within the TTL never touch any Lambda.

### Auth

A CloudFront Function on viewer-request validates the JWT (cookie or header) before the request reaches the assembly Lambda or the cache. Public wikis skip validation. See [[Tasks/Emergent]] E-2 for auth design details.

### Performance

- Cold start: sub-100ms. Non-VPC Lambda with only stdlib + boto3 (included in runtime). The E-1 benchmarks show `bare_vpc` at 88ms init with VPC overhead; without VPC this would be faster.
- Warm invocation: single-digit ms for 3 parallel S3 reads + string substitution
- Cache hit: ~10–50ms (edge latency)
- First page load after idle: ~100–300ms (Lambda cold start + S3 reads). Acceptable.

### Advantages

- Complete HTML served on every request: no client-side rendering, no JS dependency
- Works with search engines, curl, accessibility tools, etc.
- Cold start is negligible
- Zero cost at rest (Lambda scales to zero, S3 storage is pennies)
- Fragment cache granularity means minimal S3 writes per wiki edit
- Assembly Lambda is trivial to implement and test (~50 lines of code)

### Disadvantages

- Three S3 reads per cache miss (though parallelized and fast)
- Shell template must be kept in sync with Otterwiki's actual template output
- Any Otterwiki template change (theme update, layout change) requires re-rendering the shell fragment
- Adds a new Lambda function to manage/deploy

### Variant: S3 as direct CloudFront origin

If the shell template is truly static and the sidebar + content can be pre-assembled into a single HTML file on write, the assembly Lambda is unnecessary: CloudFront serves directly from S3. This only works if we accept the full-page re-render cost, because every sidebar-changing write must re-render all pages. For a 500-page wiki this is likely too expensive (see the "Scaling problem" section under Option C).

## Option B: Hybrid Static Content + Async Sidebar

### Architecture

```
Browser → CloudFront → S3 (full page HTML, no sidebar)
       → JS fetches sidebar fragment from CDN
```

The write Lambda renders full page HTML (chrome + content, no sidebar) and writes it directly to S3, which serves as the CloudFront origin. The sidebar loads asynchronously via a small inline `<script>` that fetches the sidebar fragment from the CDN and injects it into the DOM.

### Auth

Same CloudFront Functions JWT validation as Option A.

### Performance

- Content visible on first paint (no Lambda, pure CDN → S3)
- Sidebar appears shortly after (second CDN fetch, likely <50ms)
- Visual flash as sidebar loads, which may be imperceptible on fast connections

### Advantages

- No assembly Lambda at all: simplest server-side architecture
- Full page content is visible immediately; sidebar is progressive enhancement
- Each page is a single S3 object; sidebar is one shared object per wiki

### Disadvantages

- Requires JavaScript to render the sidebar; wiki is partially non-functional without JS
- Brief flash of missing sidebar on initial load
- Otterwiki's CSS layout must tolerate a missing sidebar during load (may need CSS adjustment)
- Two HTTP requests per page load instead of one (content + sidebar), though both are CDN hits

### Open question for Claude Code

Does Otterwiki's CSS layout handle a sidebar that isn't present in the initial HTML? Is the sidebar rendered into the page template server-side, or is there already a container that could be populated client-side?

## Option C: Pre-Rendered Full Pages (Static Site Generator)

### Architecture

```
Browser → CloudFront → S3 (complete pre-rendered pages)
Write → Otterwiki Lambda → render ALL pages → S3
```

On every write, the warm Otterwiki Lambda renders every page in the wiki to complete HTML (chrome + sidebar + content) and uploads them all to S3.

### Auth

Same CloudFront Functions JWT validation.

### Performance

- Reads are pure CDN → S3: fastest possible, no Lambda at all
- Write cost scales with wiki size: rendering N pages on every write

### Scaling problem

This doesn't scale. A sidebar-changing write (page create, delete, rename) requires re-rendering every page because the sidebar is embedded in each one. For a 500-page wiki, that's 500 Jinja2 renders + 500 S3 PUTs on every create/delete/rename. During an active Claude session that creates several pages in sequence, you'd queue up multiple full-site rebuilds.
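
As a rough illustration of the write-cost gap described above, the following sketch counts S3 PUTs for a session that creates several pages in sequence. The function names are hypothetical, and the model is an assumption: it treats each create under full pre-render as re-rendering every page that exists after the create (including the new one), and each create under the fragment model as one content PUT plus one sidebar PUT.

```python
def puts_fragment_model(creates: int) -> int:
    """Fragment model: each page create writes a content fragment and the sidebar."""
    return 2 * creates


def puts_full_prerender(existing_pages: int, creates: int) -> int:
    """Full pre-render: each page create re-renders every page in the wiki.

    After the i-th create the wiki holds existing_pages + i pages,
    and all of them are uploaded again.
    """
    return sum(existing_pages + i for i in range(1, creates + 1))
```

Under these assumptions, five page creates against a 500-page wiki come to 2,515 PUTs with full pre-render versus 10 with the fragment model.
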
At SaaS scale with many active wikis, this multiplies across tenants. Content-only edits (the majority) would only re-render one page, but the worst case drives the architecture.

### Advantages

- Simplest read path: pure static files, no Lambda, no assembly
- Complete HTML, no JS dependency

### Disadvantages

- Write cost proportional to wiki size, so it doesn't scale
- Rebuild queueing during active sessions
- Burns Lambda compute on re-rendering pages that didn't change

## Option D: Client-Side SPA Assembly

### Architecture

```
Browser → CloudFront → static HTML shell (SPA)
       → JS fetches content + sidebar as JSON/HTML fragments from CDN
```

The wiki becomes a single-page app. A static shell loads once; JavaScript fetches page content and sidebar as fragments from the CDN and renders them client-side.

### Advantages

- Maximum cache granularity: fragments cached independently
- Shell template cached indefinitely (content-hashed)
- Navigation between pages doesn't require full page reload

### Disadvantages

- Requires JavaScript for all functionality
- Not accessible without JS; poor for SEO (if wikis are public)
- Significant departure from Otterwiki's server-rendered model
- Would require building a new frontend rather than leveraging Otterwiki's templates

## Comparison

| Option | Cold start | Cache hit | JS required | Write cost | Complexity | Scales |
|---|---|---|---|---|---|---|
| **A: Assembly Lambda** | ~100–300ms | ~10–50ms | No | 1–2 S3 PUTs | Medium | Yes |
| **B: Hybrid + async sidebar** | None (pure CDN) | ~10–50ms | Sidebar only | 1–2 S3 PUTs | Low | Yes |
| **C: Full pre-render** | None (pure CDN) | ~10–50ms | No | N S3 PUTs (N=pages) | Low | **No** |
| **D: SPA** | None (pure CDN) | ~10–50ms | Yes (all) | 1–2 S3 PUTs | High | Yes |

## Recommendation

**Option A (Thin Assembly Lambda)** provides the best balance: complete HTML with no JS dependency, negligible cold starts, per-fragment caching, and it scales.
The assembly Lambda is trivially simple and adds minimal operational overhead.

**Option B (Hybrid)** is the simplest fallback if the sidebar can be loaded asynchronously without UI disruption.

Both options depend on whether Otterwiki's templates can produce fragments in isolation. This is the key question for Claude Code evaluation.

## Cost Impact

All options preserve the zero-cost-at-rest model:

- S3 fragment storage: pennies per wiki (a few KB per page)
- CloudFront: free tier covers light traffic
- Assembly Lambda (Option A): scales to zero, invoked only on cache misses
- No provisioned concurrency needed anywhere in the read path
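
For reference, the Option A assembly step (parse the path, fetch three fragments in parallel, substitute them into the shell, return cacheable HTML) could look roughly like the sketch below. Several things here are assumptions rather than settled design: the `/{user}/{wiki}/{page}` URL layout, the `Home` default page, the `<!--SIDEBAR-->` and `<!--CONTENT-->` placeholder markers, the response dictionary shape, and the names `parse_path` and `assemble`. The S3 read is passed in as a `fetch` callable so the logic can be exercised without AWS; in a real Lambda it would presumably wrap a boto3 `get_object` call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical placeholder markers; the real shell format is an open question.
SIDEBAR_SLOT = "<!--SIDEBAR-->"
CONTENT_SLOT = "<!--CONTENT-->"


def parse_path(path: str) -> tuple[str, str, str]:
    """Split '/{user}/{wiki}/{page...}' into its parts (assumed URL layout)."""
    parts = path.strip("/").split("/", 2)
    if len(parts) < 2:
        raise ValueError(f"cannot parse path: {path!r}")
    user, wiki = parts[0], parts[1]
    # Assumption: a bare wiki URL maps to a page named "Home".
    page = parts[2] if len(parts) == 3 and parts[2] else "Home"
    return user, wiki, page


def assemble(path: str, fetch: Callable[[str], str]) -> dict:
    """Build the full page for `path` from three fragments.

    `fetch` takes an object key and returns its text content.
    """
    user, wiki, page = parse_path(path)
    base = f"fragments/{user}/{wiki}"
    keys = [f"{base}/shell.html", f"{base}/sidebar.html", f"{base}/pages/{page}.html"]

    # Fetch all three fragments concurrently; map() preserves key order.
    with ThreadPoolExecutor(max_workers=3) as pool:
        shell, sidebar, content = pool.map(fetch, keys)

    html = shell.replace(SIDEBAR_SLOT, sidebar).replace(CONTENT_SLOT, content)
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html; charset=utf-8",
            "Cache-Control": "public, max-age=30",
        },
        "body": html,
    }
```

Passing `fetch` in keeps the function testable with a plain dictionary (`assemble("/alice/notes/Home", store.__getitem__)`). Missing fragments, error responses, and escaping of placeholder text inside fragments are deliberately left out of this sketch.
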