Properties
category: reference tags: - benchmarks - lambda - cold-start - phase-2 last_updated: 2026-03-14
E-1: Cold Start Benchmarks
Date: 2026-03-14 Lambda: otterwiki (512MB default, VPC+EFS, ~180MB package) Package contents: otterwiki, numpy, faiss-cpu, sqlalchemy, bcrypt, cryptography, PyJWT, plugins
Summary
| Config | Memory | Init (mean) | Init (p50) | Init (min) | Init (max) | Wall (mean) | n |
|---|---|---|---|---|---|---|---|
| bare_vpc | 512MB | 88.1ms | 82.8ms | 79.6ms | 106.5ms | 561.7ms | 5 |
| minimal_vpc | 512MB | 86.0ms | 81.6ms | 80.7ms | 99.7ms | 580.6ms | 5 |
| heavy_imports | 512MB | 87.8ms | 86.4ms | 73.2ms | 100.7ms | 1079.4ms | 5 |
| otterwiki | 512MB | 4513.3ms | 4478.9ms | 4347.4ms | 4826.5ms | 5674.1ms | 5 |
| bare_vpc | 1024MB | 75.4ms | 75.7ms | 65.2ms | 84.5ms | 532.8ms | 5 |
| minimal_vpc | 1024MB | 78.4ms | 77.8ms | 75.0ms | 84.9ms | 539.6ms | 5 |
| heavy_imports | 1024MB | 75.8ms | 76.6ms | 70.8ms | 81.6ms | 920.9ms | 5 |
| otterwiki | 1024MB | 4274.2ms | 4379.7ms | 3707.2ms | 4492.8ms | 5232.5ms | 5 |
Control Lambdas:
bare_vpc— minimal handler, VPC+EFS, no imports beyond stdlibminimal_vpc— VPC+EFS, lightweight imports (os, json, time), EFS read/write testheavy_imports— full otterwiki package, NO VPC/EFS (SimpleLambdaComponent)
Detailed Init Breakdown (otterwiki @ 512MB)
| Phase | Time (ms) | % of Total |
|---|---|---|
| stdlib imports | 7.6 | 0.2% |
| ensure_directories | 0.0 | 0.0% |
| import apig_wsgi | 17.7 | 0.4% |
| import plugins | 528.0 | 12.0% |
| import otterwiki.server | 3490.8 | 79.4% |
| build_multi_tenant_app | 353.7 | 8.0% |
| make_lambda_handler | 0.0 | 0.0% |
| Total init | 4397.8 | 100% |
Detailed Init Breakdown (otterwiki @ 1024MB)
| Phase | Time (ms) | % of Total |
|---|---|---|
| stdlib imports | 6.2 | 0.1% |
| ensure_directories | 0.0 | 0.0% |
| import apig_wsgi | 15.3 | 0.4% |
| import plugins | 503.9 | 11.6% |
| import otterwiki.server | 3473.3 | 79.7% |
| build_multi_tenant_app | 357.0 | 8.2% |
| make_lambda_handler | 0.0 | 0.0% |
| Total init | 4355.9 | 100% |
Analysis
What's slow
from otterwiki.server import appis 79% of cold start (~3.5s). This triggers the entire Flask app initialization chain: SQLAlchemy model creation, Jinja2 template loading, pluggy hook execution, and transitive imports of all otterwiki modules (which pull in numpy, faiss, lxml, Pillow, etc.).Plugin imports are 12% (~500ms).
otterwiki_semantic_searchimports faiss and numpy at module level.otterwiki_apiis lighter but still contributes.Multi-tenant app build is 8% (~350ms). DynamoDB model instantiation and middleware wiring. This is new in Phase 2.
What's NOT slow
- VPC+EFS attach: ~80-90ms init + ~400ms wall overhead. The VPC ENI attachment adds about 400-500ms of wall time but only 80-90ms of billed init duration. This is well within acceptable range and NOT the bottleneck.
- stdlib imports: negligible (<10ms)
- apig_wsgi import: negligible (~17ms)
- Memory scaling: minimal benefit. Doubling from 512MB to 1024MB only reduces init by ~5% (4513→4274ms). The bottleneck is import/initialization, not memory or CPU.
Key insight
The heavy_imports Lambda (no VPC) shows init durations of only 75-88ms despite having the same 180MB package. This means the Lambda runtime loads the zip/S3 package quickly — the cost is in Python actually importing the modules and running their initialization code. The otterwiki.server import triggers a cascade of module-level work (Flask app factory, SQLAlchemy engine creation, template compilation, etc.) that takes 3.5s regardless of memory.
Comparison to Phase 0
- Phase 0 (2025): ~3.4s cold start at 512MB with a much smaller package
- Phase 2 (now): ~4.5s cold start at 512MB with 180MB package
- Delta: +1.1s from added dependencies (faiss, numpy, bcrypt, cryptography, multi-tenant middleware)
Recommendations
- Lazy imports in otterwiki.server — defer numpy/faiss/Pillow/lxml until first use
- Lazy plugin loading — don't import
otterwiki_semantic_searchat module level; register a pluggy hook that loads it on first search request - Consider SnapStart (if/when Python support lands) or provisioned concurrency for production
- Multi-tenant middleware — 350ms for DynamoDB model construction could be deferred to first request
- Memory increase not recommended — 1024MB provides negligible improvement for 2x cost
Raw Log Excerpts
512MB
COLD_START_TIMINGS: {"stdlib_imports": 7.55, "ensure_directories": 0.0, "import_apig_wsgi": 17.68, "import_plugins": 528.0, "import_otterwiki_server": 3490.79, "build_multi_tenant_app": 353.68, "make_lambda_handler": 0.01, "total_init": 4397.82, "multi_tenant": true}
REPORT Duration: 3.91 ms Billed Duration: 4483 ms Memory Size: 512 MB Max Memory Used: 218 MB Init Duration: 4478.90 ms
1024MB
COLD_START_TIMINGS: {"stdlib_imports": 6.24, "ensure_directories": 0.0, "import_apig_wsgi": 15.31, "import_plugins": 503.89, "import_otterwiki_server": 3473.34, "build_multi_tenant_app": 357.02, "make_lambda_handler": 0.01, "total_init": 4355.9, "multi_tenant": true}
REPORT Duration: 2.89 ms Billed Duration: 4441 ms Memory Size: 1024 MB Max Memory Used: 218 MB Init Duration: 4437.84 ms