Commit 3344a7
2026-03-14 18:24:41 Claude (MCP): [mcp] Add design page for Lambda Library Mode architecture
| /dev/null .. Design/Lambda_Library_Mode.md | |
| @@ 0,0 1,285 @@ | |
| + | --- |
| + | category: reference |
| + | tags: [design, performance, lambda, architecture, cold-start] |
| + | last_updated: 2026-03-14 |
| + | confidence: medium |
| + | --- |
| + | |
| + | # Lambda Library Mode: Otterwiki as a Library |
| + | |
| + | **Status:** Research — queued for implementation after CDN Read Path |
| + | **Relates to:** [[Dev/E-1_Cold_Start_Benchmarks]], [[Design/CDN_Read_Path]], [[Design/Platform_Overview]] |
| + | |
| + | ## Problem |
| + | |
| + | Importing `otterwiki.server` takes ~3.5s on Lambda because `server.py` is a monolithic script that executes the entire application lifecycle at import time: Flask app creation, SQLAlchemy binding, git repo opening, plugin discovery, renderer initialization, DB DDL, config queries, and route registration with all transitive dependencies. See [[Dev/E-1_Cold_Start_Benchmarks]] for the full breakdown. |
| + | |
| + | The CDN Read Path ([[Design/CDN_Read_Path]]) solves the read-side cold start by bypassing the heavy Lambda entirely. This design addresses the **write-path cold start** (MCP, API) by restructuring how we load Otterwiki — treating it as a library of components rather than a monolithic application. |
| + | |
| + | ## Key Insight |
| + | |
| + | Every module in Otterwiki imports from `otterwiki.server`: |
| + | |
| + | ``` |
| + | otterwiki.models → from otterwiki.server import db |
| + | otterwiki.auth → from otterwiki.server import app, db |
| + | otterwiki.helper → from otterwiki.server import app, mail, storage, Preferences, db, app_renderer |
| + | otterwiki.wiki → from otterwiki.server import app, app_renderer, db, storage |
| + | otterwiki.views → from otterwiki.server import app, githttpserver |
| + | ``` |
| + | |
| + | If we provide our own module that exports the same names but initializes lazily, all of Otterwiki's business logic works unchanged — it just resolves `otterwiki.server` to our module. |
| + | |
| + | ## Architecture: `sys.modules` Injection |
| + | |
| + | Python's `sys.modules` dict controls what `import` returns. If we inject our own module before any Otterwiki code is imported, every `from otterwiki.server import X` resolves to our lazy version. |
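
A minimal sketch of the mechanism, using stand-in names: `demo_pkg`, `demo_pkg.server`, and the string values are hypothetical and unrelated to Otterwiki's real modules. It only shows that a `from ... import` statement is satisfied from `sys.modules` when the dotted name is already registered there.

```python
import sys
import types

# Hypothetical package and replacement module, built in memory for illustration.
package = types.ModuleType("demo_pkg")
package.__path__ = []  # an empty __path__ marks the module as a package

replacement = types.ModuleType("demo_pkg.server")
replacement.app = "replacement-app"
replacement.db = "replacement-db"

# Register both names before anything tries to import them.
sys.modules["demo_pkg"] = package
sys.modules["demo_pkg.server"] = replacement

# No demo_pkg exists on disk; the import is served from sys.modules.
from demo_pkg.server import app, db

print(app, db)
```

One caveat worth checking against Otterwiki's code: injection alone does not set a `server` attribute on the parent package, so code written as `import otterwiki.server` followed by `otterwiki.server.app` would likely need that attribute assigned as well. The `from otterwiki.server import X` form listed above does not depend on it.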
| + | |
| + | ``` |
| + | lambda_init.py |
| + |   └─ import lambda_server          # ~300ms: creates Flask app + SQLAlchemy, injects sys.modules |
| + |        └─ sys.modules['otterwiki.server'] = lambda_server |
| + |   └─ import otterwiki.views        # route registration only (with upstream lazy-import PRs) |
| + |   └─ build multi-tenant middleware |
| + |   └─ make_lambda_handler |
| + | |
| + | First request (via @app.before_request): |
| + |   └─ GitStorage(REPOSITORY)        # open git repo |
| + |   └─ db.create_all()               # DDL |
| + |   └─ update_app_config()           # DB query |
| + |   └─ OtterwikiRenderer(config)     # mistune + pygments + bs4 |
| + |   └─ plugin_manager.hook.setup()   # plugin initialization |
| + |   └─ Mail(app)                     # flask-mail |
| + | ``` |
| + | |
| + | ### The Replacement Module |
| + | |
| + | `lambda_server.py` exports the same names as `otterwiki.server` but uses Werkzeug's `LocalProxy` for expensive singletons: |
| + | |
| + | ```python |
| + | import os |
| + | import sys |
| + | from flask import Flask |
| + | from flask_sqlalchemy import SQLAlchemy |
| + | from werkzeug.local import LocalProxy |
| + | |
| + | # --- Cheap: created at import time (~300ms) --- |
| + | |
| + | app = Flask('otterwiki', |
| + |             template_folder='<otterwiki package>/templates', |
| + |             static_folder='<otterwiki package>/static') |
| + | app.config.update( |
| + |     # ... same defaults as server.py lines 21-77 ... |
| + | ) |
| + | app.config.from_envvar("OTTERWIKI_SETTINGS", silent=True) |
| + | # env overrides (same loop as server.py lines 87-97) |
| + | |
| + | db = SQLAlchemy(app) |
| + | |
| + | # --- Expensive: deferred via LocalProxy --- |
| + | |
| + | def _get_or_init(name): |
| + |     """Lazy-initialize expensive singletons on first access.""" |
| + |     if not hasattr(app, f'_lazy_{name}'): |
| + |         _do_deferred_init() |
| + |     return getattr(app, f'_lazy_{name}') |
| + | |
| + | def _do_deferred_init(): |
| + |     """One-shot initialization of all deferred components.""" |
| + |     import otterwiki.gitstorage |
| + |     from otterwiki.renderer import OtterwikiRenderer |
| + |     from otterwiki.plugins import plugin_manager |
| + |     from flask_mail import Mail |
| + | |
| + |     app._lazy_storage = otterwiki.gitstorage.GitStorage(app.config["REPOSITORY"]) |
| + |     app._lazy_app_renderer = OtterwikiRenderer(config=app.config) |
| + |     app._lazy_mail = Mail(app) |
| + | |
| + |     # DB init |
| + |     with app.app_context(): |
| + |         db.create_all() |
| + |         from otterwiki.models import Preferences |
| + |         for item in Preferences.query: |
| + |             # same config-update logic as server.py lines 177-205 |
| + |             app.config[item.name] = item.value |
| + | |
| + |     # Plugin setup |
| + |     plugin_manager.hook.setup( |
| + |         app=app, storage=app._lazy_storage, db=db |
| + |     ) |
| + | |
| + |     # Git HTTP server (for completeness; may not be needed in Lambda) |
| + |     import otterwiki.remote |
| + |     app._lazy_githttpserver = otterwiki.remote.GitHttpServer( |
| + |         path=app.config["REPOSITORY"] |
| + |     ) |
| + | |
| + | @app.before_request |
| + | def _ensure_deferred_init(): |
| + |     """Run deferred init on the first request, even if no proxy is touched.""" |
| + |     if not hasattr(app, '_lazy_storage'): |
| + |         _do_deferred_init() |
| + | |
| + | storage = LocalProxy(lambda: _get_or_init('storage')) |
| + | app_renderer = LocalProxy(lambda: _get_or_init('app_renderer')) |
| + | mail = LocalProxy(lambda: _get_or_init('mail')) |
| + | githttpserver = LocalProxy(lambda: _get_or_init('githttpserver')) |
| + | |
| + | # --- Inject into sys.modules --- |
| + | # Must run before the first otterwiki import below: otterwiki.models does |
| + | # `from otterwiki.server import db`, which would otherwise load the real server.py. |
| + | sys.modules['otterwiki.server'] = sys.modules[__name__] |
| + | |
| + | # Re-export models (helper.py imports Preferences from otterwiki.server) |
| + | from otterwiki.models import Preferences, Drafts, User, Cache |
| + | |
| + | # Template filters (cheap, register at import time) |
| + | # ... same @app.template_filter definitions as server.py lines 239-305 ... |
| + | |
| + | # Jinja globals |
| + | app.jinja_env.globals.update(os_getenv=os.getenv) |
| + | ``` |
| + | |
| + | ### Export Surface |
| + | |
| + | Names that `otterwiki.server` exports and other modules depend on: |
| + | |
| + | | Name | Type | Lazy? | Imported by | |
| + | |------|------|-------|-------------| |
| + | | `app` | Flask | No — must exist for decorators | everything | |
| + | | `db` | SQLAlchemy | No — must exist for Model class definitions | models, auth, helper, wiki, preferences | |
| + | | `storage` | GitStorage | **Yes** — nothing touches it until a request | helper, wiki, gitstorage, remote | |
| + | | `app_renderer` | OtterwikiRenderer | **Yes** (only used during markdown rendering) | helper, wiki | |
| + | | `mail` | Flask-Mail | **Yes** — only used when sending email | helper | |
| + | | `githttpserver` | GitHttpServer | **Yes** — only used for git HTTP routes | views | |
| + | | `Preferences` | SQLAlchemy Model | Via re-export from models (needs `db`, not `storage`) | helper | |
| + | | `Drafts` | SQLAlchemy Model | Via re-export | wiki | |
| + | | `update_app_config` | function | Called internally, deferred | (internal) | |
| + | |
| + | `app` and `db` must be real objects at import time because other modules use them to define models (`class Preferences(db.Model)`) and register routes (`@app.route`). Everything else can be a `LocalProxy`. |
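
To make the deferral concrete without depending on Werkzeug, here is a simplified stand-in for the proxy pattern. `LazyProxy`, `FakeStorage`, and `_get_storage` are hypothetical names for this sketch; the real `LocalProxy` forwards far more than attribute access.

```python
class LazyProxy:
    """Simplified stand-in for werkzeug.local.LocalProxy (attribute access only)."""

    def __init__(self, factory):
        self._factory = factory

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself,
        # so every real lookup is forwarded to the lazily built target.
        return getattr(self._factory(), name)


class FakeStorage:
    """Hypothetical expensive object; stands in for GitStorage."""

    def __init__(self, path):
        self.path = path

    def exists(self, name):
        return name == "Home.md"


_cache = {}
init_calls = 0


def _get_storage():
    """Build the storage object once, then reuse it."""
    global init_calls
    if "storage" not in _cache:
        init_calls += 1
        _cache["storage"] = FakeStorage("/tmp/repo")
    return _cache["storage"]


# Safe to export at import time: nothing is constructed yet.
storage = LazyProxy(_get_storage)
calls_before_use = init_calls

found = storage.exists("Home.md")    # first access triggers construction
missing = storage.exists("Nope.md")  # second access reuses the cached object
print(calls_before_use, init_calls, found, missing)
```

Modules that do `from otterwiki.server import storage` would hold the proxy object, so the expensive constructor runs only when one of them first uses it.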
| + | |
| + | ## Upstream Contributions |
| + | |
| + | These changes benefit all Otterwiki deployments and make library mode cleaner. Listed in order of impact and likelihood of acceptance: |
| + | |
| + | ### 1. Lazy imports in `views.py` (highest value) |
| + | |
| + | **Current:** `views.py` imports the entire dependency tree at module level: |
| + | ```python |
| + | from otterwiki.wiki import Page, Changelog, Search, AutoRoute # triggers PIL, feedgen, unidiff, bs4 |
| + | from otterwiki.sitemap import sitemap as generate_sitemap |
| + | import otterwiki.auth # triggers flask_login, werkzeug.security |
| + | import otterwiki.preferences |
| + | import otterwiki.tools |
| + | ``` |
| + | |
| + | **Proposed:** Move imports into route handler function bodies: |
| + | ```python |
| + | @app.route("/<path:path>") |
| + | def view(path="Home"): |
| + |     from otterwiki.wiki import AutoRoute |
| + |     p = AutoRoute(path, values=request.values) |
| + |     return p.view() |
| + | ``` |
| + | |
| + | Python caches imports, so only the first call to each handler pays the cost. This is a mechanical change — no logic changes, ~30 function edits. It keeps `import otterwiki.views` cheap (just decorator registration) and defers the heavy imports to first request. |
| + | |
| + | **Estimated savings:** ~500ms off import time. |
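
The caching behaviour this relies on can be seen with a standard-library module. This is only an illustration of a function-level import; `parse_ratio` is a hypothetical helper, not Otterwiki code.

```python
import sys


def parse_ratio(text: str) -> str:
    # Deferred import: the module is loaded on the first call and then
    # served from the sys.modules cache on every later call.
    from fractions import Fraction
    return str(Fraction(text))


first = parse_ratio("0.25")
second = parse_ratio("3/6")
print(first, second, "fractions" in sys.modules)
```

After the first call the import statement reduces to a lookup in `sys.modules`, which is why repeated requests to the same handler should not pay the import cost again.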
| + | |
| + | ### 2. Lazy imports in `wiki.py` |
| + | |
| + | **Current:** Top-level imports only used by specific methods: |
| + | ```python |
| + | import PIL.Image # only used in get_attachment_thumbnail() |
| + | from feedgen.feed import FeedGenerator # only used in feed_rss(), feed_atom() |
| + | # unidiff imported via helper.patchset2urlmap |
| + | ``` |
| + | |
| + | **Proposed:** Move to function bodies. |
| + | |
| + | **Estimated savings:** ~200ms off import time. |
| + | |
| + | ### 3. Extract plugin entrypoint scan |
| + | |
| + | **Current:** `plugins.py:269` calls `load_setuptools_entrypoints("otterwiki")` at module level, scanning all installed packages every time `otterwiki.plugins` is imported. |
| + | |
| + | **Proposed:** Add an `init_plugins()` function that performs the scan explicitly: |
| + | ```python |
| + | plugin_manager = pluggy.PluginManager("otterwiki") |
| + | plugin_manager.add_hookspecs(OtterWikiPluginSpec) |
| + | |
| + | def init_plugins(): |
| + |     plugin_manager.load_setuptools_entrypoints("otterwiki") |
| + | ``` |
| + | |
| + | Callers (server.py, our lambda_server.py) call `init_plugins()` when ready. On a 180MB Lambda package with numpy, faiss, sqlalchemy, etc., the entrypoint scan is not cheap. |
| + | |
| + | **Estimated savings:** ~200ms, moved from import time to controlled init. |
| + | |
| + | ### 4. Remove duplicate renderer instance |
| + | |
| + | **Current:** `renderer.py:632` creates `render = OtterwikiRenderer()` — a second instance of the full mistune parser chain, used only by the about page and as an unconfigured test renderer. |
| + | |
| + | **Proposed:** Delete it. The about page can use `app_renderer` (or lazy-create). Tests can create their own instance. |
| + | |
| + | **Estimated savings:** ~200ms off import time. |
| + | |
| + | ### 5. App factory pattern (longer-term) |
| + | |
| + | Standard Flask best practice. Would replace the module-level globals with a `create_app()` function that returns a configured Flask app. This would make our `sys.modules` injection unnecessary — we'd just call `create_app(config)` with our own config. |
| + | |
| + | This is a larger conversation and a bigger change. The other four PRs are sufficient for library mode. |
| + | |
| + | ## Estimated Init Timeline |
| + | |
| + | ### Current |
| + | ``` |
| + | INIT ████████████████████████████████████████████ 4,400ms |
| + | ``` |
| + | |
| + | ### With library mode + upstream lazy imports (PRs 1-4) |
| + | ``` |
| + | INIT ████████ ~800ms |
| + | First request (one-time): ██████████████████ ~1,800ms |
| + | Total first response: ██████████████████████████ ~2,600ms |
| + | ``` |
| + | |
| + | ### With library mode only (no upstream changes) |
| + | ``` |
| + | INIT ████████████████ ~1,600ms |
| + | First request (one-time): ██████████████ ~1,400ms |
| + | Total first response: ██████████████████████████████ ~3,000ms |
| + | ``` |
| + | |
| + | ### Breakdown of savings |
| + | |
| + | | Change | Init savings | Where cost moves | Complexity | |
| + | |--------|-------------|-----------------|------------| |
| + | | `storage` → LocalProxy | ~200ms | First request | Low — 10 lines in lambda_server | |
| + | | `app_renderer` → LocalProxy | ~300ms | First request | Low — same pattern | |
| + | | Defer `db.create_all()` + config | ~300ms | First request | Low — before_request hook | |
| + | | Lazy imports in views.py (upstream) | ~500ms | First request handler | Medium — ~30 function edits | |
| + | | Lazy plugin loading | ~500ms | First request | Low — move 2 imports + init function | |
| + | | Lazy PIL/feedgen/unidiff (upstream) | ~200ms | First use of those routes | Low — move 3 imports | |
| + | | Remove duplicate renderer (upstream) | ~200ms | N/A (eliminated) | Low — delete 1 line | |
| + | | Defer multi-tenant middleware | ~350ms | First request | Low — already in our code | |
| + | |
| + | ## Tracking Upstream Compatibility |
| + | |
| + | The coupling surface is the set of names exported by `server.py` and the internal APIs of `GitStorage`, `OtterwikiRenderer`, and the SQLAlchemy models. Mitigation: |
| + | |
| + | 1. **CI job** that runs Otterwiki's existing test suite against our replacement server module. Any export surface change (new name added to server.py, model schema change) fails the tests. |
| + | 2. **Pin to upstream tags** in the fork, not HEAD. Review upstream changes at each version bump. |
| + | 3. **The export surface is stable.** `app`, `db`, `storage` have been the core exports since Otterwiki's early versions. Template filters and Jinja globals change occasionally but are easy to sync. |
| + | |
| + | The riskiest coupling is to `server.py`'s config defaults (lines 21-77) — if a new config key is added upstream, we need to add it to lambda_server.py. The CI job catches this because Otterwiki's tests exercise config-dependent behavior. |
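
Alongside running the upstream test suite, a cheap first check could compare the replacement module against the names in the Export Surface table. The sketch below is an assumption about how such a check might look: `missing_exports` and the demo module are hypothetical, and the name list is copied from the table rows that other modules import.

```python
import types

# Names from the Export Surface table that other Otterwiki modules import.
REQUIRED_EXPORTS = (
    "app", "db", "storage", "app_renderer", "mail",
    "githttpserver", "Preferences", "Drafts",
)


def missing_exports(module, required=REQUIRED_EXPORTS):
    """Return the required names that `module` does not define, in table order."""
    # vars() inspects the module namespace without touching the objects,
    # so lazily initialized proxies are not triggered by the check.
    return [name for name in required if name not in vars(module)]


# Hypothetical replacement module that forgot two names.
candidate = types.ModuleType("lambda_server_demo")
for name in REQUIRED_EXPORTS:
    if name not in ("mail", "Drafts"):
        setattr(candidate, name, object())

print(missing_exports(candidate))
```

A check like this only detects names missing from our side; noticing a new export added upstream would still rely on the version-bump review and the test suite.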
| + | |
| + | ## Relationship to CDN Read Path |
| + | |
| + | These two designs are complementary: |
| + | |
| + | - **CDN Read Path** ([[Design/CDN_Read_Path]]) eliminates the heavy Lambda from the browser read path entirely. Reads are served by a thin assembly Lambda (<100ms cold start) or CloudFront cache. |
| + | - **Library Mode** (this document) reduces the heavy Lambda's cold start for the write path (MCP, API) from ~4.4s to ~2.6s (with upstream PRs) or ~3.0s (without). |
| + | |
| + | Together, they make the platform feel responsive: |
| + | - Browser reads: ~10-50ms (cache hit) or ~100-300ms (cache miss) |
| + | - MCP/API writes (warm): single-digit ms |
| + | - MCP/API writes (cold): ~2.6s first response, then warm for the session |
| + | |
| + | ## Open Questions |
| + | |
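| + | 1. **Does `sys.modules` injection interact with pluggy's entrypoint scanner?** Pluggy discovers entry points through `importlib.metadata` rather than by importing packages, but it then loads each matching entry point, which imports the plugin module. A plugin that imports `otterwiki.server` should resolve to our module as long as injection happens first. Should be fine, but needs verification. |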
| + | 2. **Does Flask-SQLAlchemy's `SQLAlchemy(app)` eagerly connect to the database?** If so, the DB file must exist at import time. May need `db.init_app(app)` pattern instead (deferred binding). |
| + | 3. **Can `db.create_all()` be safely called in `before_request`?** Flask-SQLAlchemy's `create_all` needs an app context. The `before_request` hook runs inside one, so this should work. |
| + | 4. **What happens if a `LocalProxy`-wrapped `storage` is accessed during module-level code in another otterwiki module?** Grep for module-level usage of `storage` outside of `server.py` to verify none exists. (Preliminary review: `gitstorage.py` defines `storage = None` at module level but doesn't import from server; `helper.py` imports it but only uses it in functions.) |
| + | 5. **Template filter registration timing.** Jinja2 template filters must be registered before the first template render. Registering them at import time in lambda_server.py (as server.py does) should be safe. |
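
For question 4, a plain grep cannot distinguish module-level use from use inside a function body. A rough sketch using the standard `ast` module follows; `module_level_uses` and the sample source are hypothetical, and the check deliberately skips whole function definitions, so it misses decorators and default arguments even though those also run at import time.

```python
import ast


def module_level_uses(source: str, name: str) -> list[int]:
    """Return line numbers where `name` is read outside any function or lambda."""
    hits = []

    def visit(node):
        # Function and lambda bodies run at call time, not import time, so skip them.
        # (Simplification: this also skips their decorators and default arguments.)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            return
        if (isinstance(node, ast.Name) and node.id == name
                and isinstance(node.ctx, ast.Load)):
            hits.append(node.lineno)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return sorted(hits)


sample = (
    "from otterwiki.server import storage\n"
    "repo_path = storage.path\n"
    "def load(name):\n"
    "    return storage.load(name)\n"
)
print(module_level_uses(sample, "storage"))
```

In the sample, only the assignment on line 2 would force the proxy to initialize at import time; the use inside `load` is deferred until the function is called.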