---
category: reference
tags: [tasks, emergent]
last_updated: 2026-03-14
confidence: high
---

## How to read this document

- Emergent tasks arise during development but don't belong to a specific phase
- They may block, inform, or optimize work in any phase
- Tasks are numbered `E-{sequence}`
- Priority indicates urgency relative to current phase work

---

## Emergent Tasks

### E-1: Re-Instrument Cold Start INIT Phase

**Priority:** High — blocks architectural decisions about cold start mitigation
**Discovered during:** Phase 2 (CDN caching design discussion)
**Relates to:** [[Dev/Phase_0_EFS_Benchmarks]], [[Design/Platform_Overview]]

**Context:**
Phase 0 benchmarks measured cold starts at ~3,400ms and attributed ~2,400ms to "VPC ENI attach." However, AWS Hyperplane ENI (shipped in 2019) pre-creates network interfaces at function creation time, not at invocation time. Current documentation and third-party benchmarks consistently report VPC overhead in the 50–100ms range or below for properly configured functions. The 2,400ms attribution is almost certainly incorrect.

The actual cold start time is likely dominated by Python package initialization: loading dulwich, Flask/Otterwiki, Mangum, aws-xray-sdk, and all transitive dependencies from a 39MB deployment package. EFS mount negotiation (the NFS/TLS handshake to the mount target) may also contribute. Without accurate instrumentation, any cold start mitigation strategy (provisioned concurrency, architecture changes, package optimization) is a guess.

**Status:** COMPLETE — see [[Dev/E-1_Cold_Start_Benchmarks]] for results. VPC overhead confirmed negligible (~80–90ms). `import otterwiki.server` is 79% of cold start (~3.5s). Architectural mitigation designed in [[Design/Lambda_Library_Mode]].

**Task:**
Re-run cold start benchmarks with fine-grained tracing of the INIT phase. Break down time spent in:

1. VPC/ENI setup (should be negligible with Hyperplane)
2. EFS mount negotiation
3. Python runtime startup
4. Module imports (dulwich, Flask, Mangum, Otterwiki, aws-xray-sdk)
5. Application initialization (framework setup, config loading)

Use X-Ray subsegments or manual timing around import blocks and init steps. Compare with a minimal VPC Lambda (no EFS, no heavy imports) as a control.
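The manual-timing half of this can be sketched as a small helper in the handler module. A sketch under assumptions: `timed` and `INIT_TIMINGS` are invented names, and stdlib modules stand in for the real heavy imports so the snippet runs anywhere; in lambda_init.py the same blocks would wrap the dulwich/Flask/otterwiki imports.

```python
# Sketch: manual INIT-phase timing, assuming the handler file controls its
# own import order. Stdlib modules stand in for the real heavy imports.
import time
from contextlib import contextmanager

INIT_TIMINGS: dict[str, float] = {}

@contextmanager
def timed(label: str):
    """Record wall-clock duration of a block into INIT_TIMINGS (ms)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        INIT_TIMINGS[label] = (time.perf_counter() - start) * 1000.0

# In lambda_init.py these would be the real targets, e.g.:
#   with timed("import dulwich"): import dulwich
#   with timed("import otterwiki.server"): import otterwiki.server
with timed("import json"):
    import json
with timed("import sqlite3"):
    import sqlite3

# Emit one log line per segment so CloudWatch Logs Insights can aggregate.
total = sum(INIT_TIMINGS.values()) or 1.0
for label, ms in sorted(INIT_TIMINGS.items(), key=lambda kv: -kv[1]):
    print(f"INIT_SEGMENT {label}: {ms:.1f}ms ({ms / total * 100:.0f}%)")
```

Each `INIT_SEGMENT` line gives both the absolute and relative contribution, which is exactly what the acceptance criteria below ask for.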

**Deliverables:**
- Updated [[Dev/Phase_0_EFS_Benchmarks]] with corrected INIT breakdown
- Identification of top 2–3 contributors to cold start latency
- Recommendation: package optimization, lazy imports, memory tuning, or architectural change

**Acceptance criteria:**
- [x] INIT phase broken into at least 4 measured segments
- [x] Each segment's contribution to total cold start quantified (ms and %)
- [x] Control Lambda (minimal VPC, no EFS) measured for baseline comparison
- [x] Benchmark page updated with corrected attribution

---

### E-2: CDN Caching Layer Design

**Priority:** Medium — improves page load UX independent of the cold start fix
**Discovered during:** Phase 2 (page responsiveness discussion)
**Relates to:** [[Design/Platform_Overview]], [[Design/Operations]]

**Context:**
Wiki pages are written infrequently (during Claude sessions via MCP) and read much more often (browsing, reference). CloudFront is already in the architecture for static SPA hosting but is not used to cache wiki page content. Adding a caching layer for page reads would reduce most page loads from ~270ms (warm Lambda) to ~10–50ms (edge serve), and reduce origin load.

**Status:** Design complete — see [[Design/CDN_Read_Path]]. Option A (Thin Assembly Lambda) recommended. Implementation queued as [[Tasks/E-2_CDN_Read_Path]].

**Design decisions needed:**

**Cache freshness strategy:** Short TTL (30–60s) on page HTML with `Cache-Control` headers from the origin. No invalidation API calls under normal operation — pages self-expire. Static assets (CSS, JS, fonts) use content-hashed filenames with long TTLs (1 year). Invalidation is reserved for exceptional cases (page deletion, privacy). This avoids the invalidation cost problem: at scale (e.g. 1,000 active wikis × 5 writes/day), path-based invalidation would exceed the 1,000/month free tier and cost ~$220/month, growing linearly with write volume.
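The freshness strategy above reduces to a per-route header policy at the origin. A minimal sketch, assuming three route classes with illustrative path patterns; the real values would be set on Flask responses (or by the assembly Lambda), and the exact patterns are not from the design:

```python
# Sketch of the origin's Cache-Control policy as a pure function.
# Path prefixes here are illustrative, not the real route table.
def cache_control_for(path: str) -> str:
    if path.startswith("/static/"):
        # Content-hashed assets: safe to cache for a year, immutable.
        return "public, max-age=31536000, immutable"
    if path.startswith("/mcp") or path.startswith("/api/"):
        # MCP and API routes must never be served from the edge cache.
        return "no-store"
    # Page HTML: short TTL so edits appear within ~60s with no invalidations.
    return "public, max-age=60"
```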

**Auth-aware caching for private wikis:** Three options evaluated:

1. **CloudFront signed cookies** — most CloudFront-native; set after OAuth login, scoped to the user's subdomain. CloudFront validates the signature at the edge before serving cached content. Signed cookie attributes are excluded from the cache key, so all authenticated users of the same wiki share cached pages.
2. **CloudFront Functions with JWT validation** — lightweight JS function on viewer-request that validates the JWT at the edge using the built-in crypto module plus KeyValueStore for the public key. Sub-millisecond execution, no extra cost. Works for RS256 if public key verification fits within the execution constraints.
3. **Lambda@Edge** — most powerful; can run full OIDC flows, but heavier and more expensive. Overkill for token validation on cached content.

Recommended approach: CloudFront Functions (option 2) for auth validation plus a short-TTL cache for page content. Needs validation that RS256 signature verification runs within CloudFront Functions execution limits.

**MCP calls are not cached** — they are POST requests on a separate path pattern and always pass through to Lambda.

**Deliverables:**
- Design page: `Design/CDN_Caching`
- CloudFront Functions prototype for JWT validation (validate that RS256 fits within execution constraints)
- Estimate of cache hit ratio for typical wiki usage patterns

**Acceptance criteria:**
- [ ] Design page documents cache strategy, auth approach, TTL rationale, and cost model
- [ ] CloudFront Functions JWT validation tested (RS256 performance confirmed or HS256 fallback documented)
- [ ] Cache behavior configuration specified for page content vs. static assets vs. MCP vs. API routes

---

### E-3: Client-Side Encryption / Zero-Knowledge Storage

**Priority:** Medium — not a launch blocker, but important for trust and the privacy story
**Discovered during:** Landing page copywriting (2026-03-14)
**Status:** Design spike complete — see [[Design/E-3_Encryption_Spike]]. Recommendation: EFS encryption at rest + IAM audit logging for launch; per-user KMS deferred until the storage model changes.

**Context:**
The current privacy claim is "your wiki is private by default" — but the operator (us) can still read the data at rest on EFS. For a product whose pitch is "memory for your agents," the data is inherently sensitive: it is the user's working notes, research, plans, and whatever their agents are writing on their behalf.

Ideally, wiki content would be encrypted client-side so that even the platform operator cannot read it. This is a hard problem for a wiki with a web UI and MCP access — both need to decrypt content to render and search it — but it is worth investigating.

**Areas to explore:**
- Client-side encryption with key derived from user credential (e.g. HKDF from OAuth token or user-supplied passphrase)
- Impact on web UI rendering (decrypt in browser via Web Crypto API?)
- Impact on MCP access (agent would need the key — how does that work?)
- Impact on semantic search (can't embed encrypted text — is search a premium-only feature anyway?)
- Impact on git clone (encrypted blobs in repo, decrypt locally?)
- Precedents: Standard Notes, Proton Drive, age-encrypted git repos
- Partial approaches: encrypt at rest with per-user KMS keys (operator can't casually surveil, but AWS access would still allow it)
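The HKDF idea in the first bullet can be sketched with the stdlib. `hkdf_sha256` follows RFC 5869; the salt and info labels are invented for the example, and a real design would put a memory-hard KDF (scrypt/argon2) in front of any low-entropy passphrase before feeding it to HKDF.

```python
# Sketch: HKDF-SHA256 (RFC 5869) deriving a per-wiki content key from a
# user-supplied secret. Labels below are hypothetical.
import hashlib, hmac

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                            # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# One key per wiki, bound to a purpose label so derived keys don't collide.
content_key = hkdf_sha256(
    ikm=b"user passphrase goes here",
    salt=b"wikibot-io/v1",             # hypothetical versioned salt
    info=b"wiki:example/content-key",  # hypothetical per-wiki label
)
```

Because derivation is deterministic, the browser (via Web Crypto) and an MCP client holding the same credential arrive at the same key without the server ever seeing it.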

**Deliverables:**
- Design spike: what's feasible, what's the UX impact, what are the tradeoffs
- Recommendation: full zero-knowledge, per-user KMS, or "encrypt at rest and be honest about the limits"

**Acceptance criteria:**
- [ ] At least three approaches evaluated with pros/cons
- [ ] UX impact documented for web UI, MCP, git clone, and search
- [ ] Recommendation with rationale

---

### E-4: Lambda Library Mode Implementation

**Priority:** Medium — reduces write-path cold start from ~4.5s to ~2.6s
**Discovered during:** Cold start deep dive (2026-03-14)
**Depends on:** [[Dev/E-1_Cold_Start_Benchmarks]] (complete), [[Design/Lambda_Library_Mode]]
**Relates to:** [[Design/CDN_Read_Path]], [[Design/Platform_Overview]]

**Context:**
The E-1 benchmarks identified `import otterwiki.server` as 79% of cold start (~3.5s). The Lambda Library Mode design ([[Design/Lambda_Library_Mode]]) proposes replacing `otterwiki.server` with a lazy-loading drop-in via `sys.modules` injection, plus upstream contributions to defer heavy imports in views.py and wiki.py.

**Task:**

**Fork work (lambda_server.py):**
1. Write `lambda_server.py` that exports `app`, `db`, `storage` (LocalProxy), `app_renderer` (LocalProxy), `mail` (LocalProxy), `githttpserver` (LocalProxy), model re-exports, template filters, and Jinja globals
2. Inject it via `sys.modules['otterwiki.server']` before any otterwiki imports in lambda_init.py
3. Add an `@app.before_request` hook for deferred init (db.create_all, config, plugins, multi-tenant middleware)
4. Verify all existing E2E tests pass against the replacement module
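The injection mechanics of steps 1-3 can be sketched as follows. The module body is illustrative, not the real `lambda_server.py`: `install_stub_server` and `run_deferred_init` are invented names, and only the `sys.modules` trick and the deferred-init shape are the point. Note the real `otterwiki` parent package must still be importable; only the `server` submodule is replaced.

```python
# Sketch: register a stand-in 'otterwiki.server' before otterwiki imports run.
import sys, types

def install_stub_server(app) -> types.ModuleType:
    mod = types.ModuleType("otterwiki.server")
    mod.app = app  # heavy objects (db, storage, ...) would be LocalProxy here

    # Step 3: defer expensive init (db.create_all, config, plugin scan) to
    # the first request instead of doing it at import time.
    state = {"initialized": False}

    def run_deferred_init():
        if not state["initialized"]:
            state["initialized"] = True
            # ... db.create_all(), config load, plugins would go here ...
    mod.run_deferred_init = run_deferred_init

    # Step 2: must run before anything does `import otterwiki.server`,
    # so the import system hands out this module instead of the real one.
    sys.modules["otterwiki.server"] = mod
    return mod
```

In lambda_init.py the real version would wire `run_deferred_init` into an `@app.before_request` hook so the first request pays the init cost once.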

**Upstream PRs:**
5. Lazy imports in `views.py` — move heavy imports into route handler function bodies
6. Lazy imports in `wiki.py` — move PIL.Image and feedgen to function level
7. Extract the plugin entrypoint scan into an explicit `init_plugins()` function
8. Remove the duplicate `render = OtterwikiRenderer()` in renderer.py:632

**Deliverables:**
- `lambda_server.py` in the wikibot-io repo
- Updated `lambda_init.py` using the injection pattern
- 4 upstream PRs to otterwiki (items 5-8)
- Updated cold start benchmarks showing the improvement

**Acceptance criteria:**
- [ ] Lambda INIT < 1,600ms without upstream PRs
- [ ] Lambda INIT < 800ms with upstream PRs accepted
- [ ] All existing E2E tests pass
- [ ] First-request latency < 2,000ms (deferred init)
- [ ] Warm request latency unchanged

---

### E-5: Retain .pyc Files in Lambda Package

**Priority:** Low — estimated 200–400ms cold start reduction, low effort
**Discovered during:** Cold start mitigation discussion (2026-03-14)
**Relates to:** [[Dev/E-1_Cold_Start_Benchmarks]]

**Context:**
The Lambda build script (`app/otterwiki/build.sh`) strips all `.pyc` files and `__pycache__` directories to reduce package size. However, Lambda's package filesystem is read-only — Python cannot cache compiled bytecode at runtime, so every cold start recompiles every `.py` file it imports. Retaining pre-compiled `.pyc` files (or better, running `python -m compileall` during the build) skips the compilation step on cold start.

Estimated savings: 200–400ms. The bulk of import time is executing module-level code and loading C extensions, not bytecode compilation, so the improvement is modest. But the change is a one-liner with essentially zero risk.

**Task:**
1. Remove the `find "$PACKAGE_DIR" -name "*.pyc" -delete` line from `build.sh`
2. Remove the `find "$PACKAGE_DIR" -type d -name "__pycache__" -exec rm -rf {} +` line
3. Add `python -m compileall -q "$PACKAGE_DIR"` after stripping to pre-compile all `.py` files
4. Measure the package size delta
5. Re-run the cold start benchmark to measure the actual improvement
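Step 3 can be sanity-checked locally, with one caveat worth building in: `.pyc` validation is mtime-based by default, and zip packaging can perturb source mtimes, which would silently force recompilation anyway. Hash-based invalidation (Python 3.7+) sidesteps that. A sketch against a temp directory rather than the real `$PACKAGE_DIR`:

```python
# Sketch: pre-compile a package dir with hash-based .pyc invalidation.
# CLI equivalent: python -m compileall -q --invalidation-mode unchecked-hash DIR
import compileall, pathlib, py_compile, tempfile

with tempfile.TemporaryDirectory() as pkg:
    (pathlib.Path(pkg) / "mod.py").write_text("VALUE = 42\n")
    ok = compileall.compile_dir(
        pkg,
        quiet=1,
        # Hash-based validation: .pyc stays valid regardless of mtimes.
        invalidation_mode=py_compile.PycInvalidationMode.UNCHECKED_HASH,
    )
    pycs = list(pathlib.Path(pkg).rglob("*.pyc"))
```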

**Acceptance criteria:**
- [ ] `.pyc` files retained in the Lambda package
- [ ] Cold start improvement measured and documented
- [ ] Package size increase documented (expected: ~10–20MB)

---

### E-6: Lambda Warming Ping

**Priority:** Low — stopgap measure, not a long-term solution
**Discovered during:** Cold start mitigation discussion (2026-03-14)
**Relates to:** [[Dev/E-1_Cold_Start_Benchmarks]], [[Design/CDN_Read_Path]]

**Context:**
An EventBridge rule invoking the otterwiki Lambda every 5 minutes keeps one execution environment warm, eliminating cold starts for the common case (a single user browsing). The cost is effectively $0/month: 8,760 invocations/month at 500ms × 512MB is 2,190 GB-seconds, well within the 400K GB-s free tier.
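The free-tier arithmetic above, spelled out (the 5-minute rate and 512MB memory are this task's settings; the 400K GB-s figure is AWS's published Lambda free tier):

```python
# 12 pings/hour, 24 hours/day, averaged over a year and divided by 12 months.
invocations_per_month = 12 * 24 * 365 // 12          # 8,760
gb_seconds = invocations_per_month * 0.5 * 0.5       # 500ms at 512MB (0.5 GB)
print(invocations_per_month, gb_seconds)
```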

Limitations: this only keeps one instance warm. Concurrent requests beyond one still cold-start, so it does not solve the problem at scale — it just masks it for low-traffic scenarios.

This is a band-aid, not a solution. The CDN read path ([[Tasks/E-2_CDN_Read_Path_ClientSide]]) is the proper fix; this task exists as a fallback if the CDN work is delayed.

**Task:**
1. Add an EventBridge rule in `infra/__main__.py`: invoke the otterwiki Lambda every 5 minutes
2. Add a Lambda permission for EventBridge to invoke it
3. The Lambda handler already handles non-API-Gateway events gracefully (returns early)
4. Verify the warm state with a benchmark
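Steps 1-2 as Pulumi resources might look like the following sketch. It assumes `pulumi_aws` and an existing Lambda resource bound to a variable `otterwiki_fn`; the resource names are illustrative, and the fragment only runs inside a Pulumi program.

```python
# Sketch of the warming rule in infra/__main__.py (pulumi_aws assumed).
import pulumi_aws as aws

warm_rule = aws.cloudwatch.EventRule(
    "otterwiki-warmer",
    schedule_expression="rate(5 minutes)",
)

# Step 2: allow EventBridge to invoke the function.
aws.lambda_.Permission(
    "otterwiki-warmer-permission",
    action="lambda:InvokeFunction",
    function=otterwiki_fn.name,
    principal="events.amazonaws.com",
    source_arn=warm_rule.arn,
)

# Point the rule at the function; per step 3 the handler returns early
# for non-API-Gateway events, so the ping body doesn't matter.
aws.cloudwatch.EventTarget(
    "otterwiki-warmer-target",
    rule=warm_rule.name,
    arn=otterwiki_fn.arn,
)
```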

**Acceptance criteria:**
- [ ] EventBridge rule triggers the Lambda every 5 minutes
- [ ] Lambda stays warm between invocations (no cold start on the next real request)
- [ ] Cost: $0/month additional